NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 



2. BACKGROUND 


Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.



3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. The
nucleic acid sequences also include a polynucleotide comprising a nucleotide sequence having at
least 90% identity to an identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate
variant or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
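
The base-pairing rule in this definition can be illustrated with a short Python sketch. It is not part of the specification: the function names are hypothetical, only the bases A, C, G and T are handled, and "degree of complementarity" is modeled simply as the fraction of paired positions between two equal-length strands.

```python
# Watson-Crick pairing for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    a = strand_a.upper()
    b_reversed = strand_b.upper()[::-1]  # read strand_b 3'->5'
    paired = sum(1 for x, y in zip(a, b_reversed) if PAIR[x] == y)
    return paired / len(a)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads ACT when written 5'->3'.
print(reverse_complement("AGT"))
print(complementarity("AGT", "ACT"))  # complete complementarity
print(complementarity("AGT", "ACC"))  # partial complementarity
```

A value of 1.0 corresponds to "complete" complementarity in the sense used above; any lower value is "partial".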

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
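
As a minimal illustration of the symbol conventions just stated (T replaced by U for RNA, and N standing for any of the four bases), the following Python sketch may help. The function names are hypothetical and the example sequences are invented for illustration; they are not taken from the Sequence Listing.

```python
def to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by substituting U (uracil) for T (thymine)."""
    return dna.upper().replace("T", "U")


def matches_template(template: str, seq: str) -> bool:
    """True if seq agrees with template at every position, where an N in the
    template stands for any of A, C, G or T."""
    if len(template) != len(seq):
        return False
    return all(t == "N" or t == s for t, s in zip(template.upper(), seq.upper()))


print(to_rna("ATGGCT"))                      # AUGGCU
print(matches_template("ATNGC", "ATAGC"))    # N matches A
print(matches_template("ATNGC", "ATAGT"))    # last base differs
```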

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30




WO 01/53312 PCT/US00/34263 

nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In
the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
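
The arithmetic behind these estimates can be reproduced with a short Python sketch. It assumes, as a simplification not stated in the specification, that bases are uniformly random and independent, so the odds against a chance full match are the number of possible k-mers divided by the number of base pairs searched. The function name is hypothetical, and the 5% figure is used as an upper bound on the expressed fraction.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on the expressed fraction (from the text)


def odds_against_full_match(k: int, target_bp: float) -> float:
    """Number of possible k-mers (4**k) per base pair of target sequence.

    A value of x reads as "about 1 in x" for a chance full match.
    """
    return 4 ** k / target_bp


print(round(odds_against_full_match(20, GENOME_BP)))        # twenty-mer, whole genome
print(round(odds_against_full_match(17, GENOME_BP), 1))     # seventeen-mer, whole genome
print(round(odds_against_full_match(15, GENOME_BP * EXPRESSED_FRACTION), 1))  # fifteen-mer, expressed
```

Under this simplified model the results are about 367, 5.7 and 7.2, the same order of magnitude as the rounded figures of 1 in 300 and 1 in 5 given above.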

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
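
The single-mismatch calculation described above, (1/4^k) multiplied by 3 x k, can be sketched in Python as follows. This is an illustration under the same simplifying assumption of uniformly random, independent bases; the function names are hypothetical, and the 5% expressed fraction is treated as an upper bound.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on the expressed fraction (from the text)


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-base site matches a given k-mer at all but
    exactly one position: (1 / 4**k) * (3 * k)."""
    return (3 * k) / 4 ** k


def expected_single_mismatch_sites(k: int, target_bp: float) -> float:
    """Expected number of chance single-mismatch sites in target_bp base pairs."""
    return target_bp * single_mismatch_probability(k)


print(single_mismatch_probability(25))                       # per-site value for a 25-mer
print(expected_single_mismatch_sites(20, GENOME_BP))         # twenty-mer, whole genome
print(expected_single_mismatch_sites(18, GENOME_BP * EXPRESSED_FRACTION))  # eighteen-mer, expressed
```

With these inputs the twenty-mer case gives roughly 0.16 and the eighteen-mer case roughly 0.12 expected chance sites, which is of the same order as the "approximately one in five" figures in the text.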

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
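
A minimal Python sketch of this definition follows: it scans one reading frame and reports maximal runs of triplets that contain no termination codon. The function name and the example sequence are hypothetical. The sketch uses the three standard termination codons and, like the definition above, does not require a start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons


def open_reading_frames(seq: str, frame: int = 0, min_codons: int = 1):
    """Yield (start, end) index pairs of maximal stop-free codon runs in one frame.

    seq[start:end] is a whole number of triplets with no termination codon.
    """
    seq = seq.upper()
    start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if (pos - start) // 3 >= min_codons:
                yield (start, pos)
            start = pos + 3  # resume after the termination codon
        pos += 3
    if (pos - start) // 3 >= min_codons:
        yield (start, pos)


example = "ATGGCCTAAGGGTTT"  # codons: ATG GCC TAA GGG TTT
for start, end in open_reading_frames(example):
    print(start, end, example[start:end])
```

Scanning the other two forward frames, or the reverse complement, only requires calling the function again with a different `frame` or input strand.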

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g. repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 
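
The use of codon redundancy to introduce a restriction site without changing the encoded amino acids can be illustrated with a short Python sketch. The codon table is the standard genetic code; the function names and the six-base example are hypothetical and are not taken from the Sequence Listing. The example changes GGT TCT to GGA TCC, which still encodes glycine-serine but creates the BamHI recognition sequence GGATCC.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_change(original: str, modified: str) -> bool:
    """True if the two coding sequences differ but encode the same amino acids."""
    return original.upper() != modified.upper() and translate(original) == translate(modified)


original = "GGTTCT"   # Gly-Ser
modified = "GGATCC"   # Gly-Ser, now containing a BamHI site (GGATCC)
print(translate(original), translate(modified))
print(is_silent_change(original, modified))
```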

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
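
The four residue classes listed above can be encoded directly, which gives a simple test for whether a substitution is "conservative" in the sense of this paragraph. The Python sketch below is illustrative only: the function names are hypothetical, the standard one-letter amino acid codes are used, and membership in the same class is taken as the sole criterion, which is coarser than the multiple properties (polarity, charge, solubility and so on) the text mentions.

```python
# Residue classes as listed in the text, using one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the class containing the given residue."""
    for name, members in RESIDUE_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if a substitution stays within one of the classes above."""
    if original.upper() == replacement.upper():
        return False  # not a substitution
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
print(is_conservative("D", "E"))  # aspartic acid -> glutamic acid, both acidic
print(is_conservative("D", "K"))  # acidic -> basic
```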

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid, phage, virus,
or other vector for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
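
The oligonucleotide wash temperatures listed above can be written as a simple lookup. The following Python sketch is illustrative only: the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and because the text gives no rule for lengths other than the four listed, the sketch does not interpolate between them.

```python
# Wash temperatures (degrees C) for washing in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the listed wash temperature for an oligonucleotide length.

    Only the four lengths named in the text are supported; any other length
    raises ValueError rather than guessing a temperature.
    """
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no wash temperature is listed for {oligo_length}-base oligonucleotides"
        ) from None
```

For example, `wash_temperature_c(20)` returns 55, matching the 20-base entry above.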

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
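
The arithmetic in the definition above (changes divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is a minimal Python illustration, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool, with `-` marking gaps, and the function names are hypothetical. It checks only the numeric threshold; the functional requirement stated in the definition is not something code of this kind can assess.

```python
def fractional_variation(reference_aln: str, subject_aln: str) -> float:
    """Fraction of changes in a pre-aligned subject sequence.

    Counts substitutions, additions and deletions (aligned columns that differ)
    and divides by the number of residues in the subject sequence, following
    the definition in the text. Gaps are written as '-'.
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    subject_residues = 0
    for ref, sub in zip(reference_aln, subject_aln):
        if sub != "-":
            subject_residues += 1
        if ref == "-" and sub == "-":
            continue  # column carries no information
        if ref != sub:
            changes += 1
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def within_variation_threshold(reference_aln: str, subject_aln: str,
                               max_variation: float = 0.35) -> bool:
    """True if the variation is at or below the threshold (0.35 by default)."""
    return fractional_variation(reference_aln, subject_aln) <= max_variation
```

With a ten-residue reference and a subject that differs at one position, `fractional_variation` returns 0.1, which in the terminology above corresponds to 90% sequence identity.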

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 



4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.
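
As a reading aid, the SEQ ID NO ranges recited above can be captured in a small lookup. The Python sketch below uses only the ranges stated in this section; the names `NUCLEOTIDE_RANGES`, `POLYPEPTIDE_RANGES` and `classify_seq_id` are hypothetical, and the sketch makes no assumption about which nucleotide sequence encodes which polypeptide.

```python
# SEQ ID NO ranges as recited in the text (inclusive bounds in the text;
# Python ranges are half-open, hence the +1 on each upper bound).
NUCLEOTIDE_RANGES = (range(1, 1786 + 1), range(3573, 5358 + 1))
POLYPEPTIDE_RANGES = (range(1787, 3572 + 1), range(5359, 7144 + 1))


def classify_seq_id(seq_id: int) -> str:
    """Return 'nucleotide' or 'polypeptide' for a SEQ ID NO in the recited ranges."""
    if any(seq_id in r for r in NUCLEOTIDE_RANGES):
        return "nucleotide"
    if any(seq_id in r for r in POLYPEPTIDE_RANGES):
        return "polypeptide"
    raise ValueError(f"SEQ ID NO:{seq_id} is outside the recited ranges")
```

For instance, `classify_seq_id(3573)` returns `"nucleotide"` and `classify_seq_id(1787)` returns `"polypeptide"`.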

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also include polynucleotides comprising nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
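
The codon substitution described above can be illustrated with a short Python sketch that translates two coding sequences and checks whether they encode the same amino acids. It assumes the standard genetic code and an in-frame DNA sequence; the function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if two coding sequences translate to the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)
```

For example, `ATGGCTTTA` and `ATGGCCCTG` differ at two codons yet both translate to `MAL`, so `encode_same_protein` returns `True` for that pair.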

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.
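
The "in series" strategy described above (conservative substitutions first, more distant ones afterwards) can be sketched as an ordering step. The Python below is illustrative only: the residue grouping in `RESIDUE_CLASS` is one common simplification chosen for this sketch and is an assumption, not a classification given in the text, and `order_substitutions` is a hypothetical helper name.

```python
# Assumed, simplified residue classes (one-letter codes); other groupings exist.
_CLASSES = {
    "hydrophobic": "AVLIMFW",
    "charged": "DEKR",
    "other": "STNQYCGHP",
}
RESIDUE_CLASS = {aa: name for name, members in _CLASSES.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same assumed class."""
    return RESIDUE_CLASS[original] == RESIDUE_CLASS[replacement]


def order_substitutions(original: str, candidates: list[str]) -> list[str]:
    """Order candidate replacements: conservative choices first, distant ones after.

    The sort is stable, so candidates keep their input order within each group.
    """
    return sorted(candidates, key=lambda aa: not is_conservative(original, aa))
```

For a leucine position, `order_substitutions("L", ["K", "I", "D", "V"])` places the hydrophobic candidates `I` and `V` before the charged candidates `K` and `D`.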

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
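
The oligonucleotide layout described above (a changed codon flanked on both sides by template sequence) can be sketched in Python. This is a minimal illustration under stated assumptions: the coding sequence is taken to start at position 0 of the template, the default flank of 12 nucleotides per side is an arbitrary placeholder for "sufficient adjacent nucleotides", and `design_mutagenic_oligo` and the example template are hypothetical. The sketch does not evaluate duplex stability.

```python
def design_mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                           flank: int = 12) -> str:
    """Build a sense-strand oligo carrying one codon change.

    template    -- DNA coding sequence, reading frame starting at position 0
    codon_index -- 0-based index of the codon to change
    new_codon   -- replacement codon (three DNA bases)
    flank       -- unchanged template nucleotides kept on each side (assumed value)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three DNA bases")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence on both sides of the codon")
    # Left flank + changed codon + right flank.
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

Using a made-up 30-nucleotide template `ATGGCTAAAGGTCTGTGCGAAACCTTTGGA`, changing codon 5 (`TGC`, cysteine) to `TCC` (serine) with `flank=6` yields `GGTCTGTCCGAAACC`.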

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of
suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of an mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
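
Design by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of the sense sequence. The Python sketch below illustrates this for an unmodified DNA oligonucleotide; it does not cover Hoogsteen pairing or modified nucleotides. The function names are hypothetical, and treating the first `ATG` in the input as the translation start site is a simplifying assumption made only for this example.

```python
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")  # U in an mRNA pairs with A


def antisense_dna(sense: str) -> str:
    """Return the DNA reverse complement of a sense DNA or mRNA sequence."""
    sense = sense.upper()
    if set(sense) - set("ACGTU"):
        raise ValueError("sequence may contain only A, C, G, T or U")
    return sense.translate(_COMPLEMENT)[::-1]


def antisense_to_start_site(mrna: str, upstream: int = 10,
                            downstream: int = 10) -> str:
    """Antisense oligo covering the region around the first ATG/AUG.

    Assumes the first ATG is the translation start site, which is a
    simplification; upstream/downstream are nucleotides kept on each side.
    """
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG/AUG found in sequence")
    window = seq[max(0, start - upstream):start + 3 + downstream]
    return antisense_dna(window)
```

For the made-up sequence `GGCCACCATGGCTAAAGG` with four nucleotides kept on each side of the start codon, the sense window is `CACCATGGCTA` and the resulting antisense oligonucleotide is `TAGCCATGGTG`.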

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.
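
As a non-limiting illustration of designing a ribozyme from a disclosed nucleotide sequence, the following Python sketch lists candidate target sites in an mRNA. It rests on an assumption that is not stated in the present text: that hammerhead ribozymes are commonly reported to cleave 3' of an NUH triplet (N being any nucleotide and H being A, C or U), GUC being the classic example. The function name, the `arm` parameter (the number of flanking nucleotides required on each side for the ribozyme binding arms) and the example sequences are hypothetical.

```python
import re

# Zero-width lookahead so that overlapping N-U-H triplets are all reported.
# Assumption (not from the specification): cleavage occurs 3' of such triplets.
NUH = re.compile(r"(?=([ACGU]U[ACU]))")


def candidate_sites(mrna: str, arm: int = 8):
    """Return (0-based triplet start, triplet) pairs for every NUH triplet
    that has at least `arm` nucleotides of flanking sequence on both sides."""
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA notation
    sites = []
    for match in NUH.finditer(seq):
        start = match.start()
        if start >= arm and start + 3 + arm <= len(seq):
            sites.append((start, match.group(1)))
    return sites


if __name__ == "__main__":
    for position, triplet in candidate_sites("GGCACGAGGUCAUGGCUUCAGGAUCCAAGCUU"):
        print(position, triplet)
```

Each reported site would still need to be evaluated experimentally; the sketch only narrows the sequence to positions that satisfy the assumed triplet rule.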

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5:539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences



WO 01/53312 PCT/US00/34263 

set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
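
The percent amino acid identity thresholds recited above can be checked computationally. The following Python sketch is illustrative only: it assumes that the two sequences have already been aligned (gaps written as "-") by an alignment program, and it adopts one common convention, namely identical columns divided by the alignment length; other conventions exist and may give different values. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.
    Columns containing a gap ('-') are never counted as identical."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_substantial_equivalent(aligned_a: str, aligned_b: str,
                              threshold: float = 90.0) -> bool:
    """True when identity meets the chosen threshold (e.g., 65 ... 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # made-up sequences, one substitution apart
    variant = "MKTAYLAKQR"
    print(percent_identity(reference, variant))
    print(is_substantial_equivalent(reference, variant, threshold=95.0))
```

Identity alone does not establish that a variant retains biological activity; that property would be assessed separately.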

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
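
The notion of a degenerate variant defined above can be illustrated with a short Python sketch that translates two open reading frames using the standard genetic code and checks whether they differ in nucleotide sequence while encoding an identical polypeptide. The function names and the example ORFs are hypothetical; the sketch assumes DNA notation, reading frame 1, and translation that stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA ORF in frame 1, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when the two ORFs differ in nucleotide sequence but encode
    an identical polypeptide sequence."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    print(translate("ATGGCTTCATAA"))
    print(is_degenerate_variant("ATGGCTTCATAA", "ATGGCCAGCTGA"))
```

In the example, the two made-up ORFs use different codons for alanine, serine and the stop signal yet encode the same tripeptide.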

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
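
The sequence-generation step of the alanine-scanning method described above can be sketched in Python as follows. The sketch is illustrative only: the function name, the `window` parameter (1 for single residues, larger values for strings of residues) and the example peptide are hypothetical, and the subsequent testing of each variant for biological activity is an experimental step that is not modelled.

```python
def alanine_scan(protein: str, window: int = 1):
    """Return (1-based start position, variant sequence) pairs in which a
    window of `window` consecutive residues has been replaced by alanine.
    Windows that are already all alanine are skipped, since the resulting
    sequence would be identical to the starting protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for start in range(len(protein) - window + 1):
        variant = protein[:start] + "A" * window + protein[start + window:]
        if variant != protein:
            variants.append((start + 1, variant))
    return variants


if __name__ == "__main__":
    for position, variant in alanine_scan("MKTCYIAKQR"):  # made-up peptide
        print(position, variant)
```

Each variant sequence would then be expressed or synthesized and assayed, and a loss of activity would point to the substituted residue(s) as important for function.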

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
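
By way of illustration only, the idea of choosing the alignment that gives the largest match and then reporting identity over that alignment can be sketched as follows. This is a minimal Needleman-Wunsch global alignment with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions made for this sketch and do not reproduce the scoring matrices, affine gap penalties or heuristics used by GAP, BLAST or FASTA.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not the defaults
    of any program named in the text.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


print(global_align("ACGT", "ACT"))      # ('ACGT', 'AC-T')
print(percent_identity("ACGT", "ACT"))  # 75.0
```

Note that the denominator chosen here (alignment length including gaps) is only one of several conventions; the programs cited above may report identity over a different length.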

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
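
The requirement that the two coding sequences be joined in-frame can be checked computationally before a construct is made. The following is a minimal sketch under stated assumptions: both inputs are complete coding sequences in DNA letters, the standard genetic code applies, and the names `fuse_in_frame` and `translate` are hypothetical helpers introduced here for illustration only. It removes the stop codon of the N-terminal partner, concatenates the two sequences, and confirms that the result translates as a single open reading frame.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def fuse_in_frame(n_term_cds, c_term_cds):
    """Join two coding sequences so the second is read in the same frame.

    Returns (fusion DNA, fusion protein). Raises ValueError if either
    input is not a whole number of codons or if the fusion contains a
    stop codon before its final codon.
    """
    n_term_cds, c_term_cds = n_term_cds.upper(), c_term_cds.upper()
    if len(n_term_cds) % 3 or len(c_term_cds) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    # Drop the N-terminal partner's stop codon so translation reads through.
    if n_term_cds[-3:] in STOP_CODONS:
        n_term_cds = n_term_cds[:-3]
    fusion = n_term_cds + c_term_cds
    protein = translate(fusion)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon: not a single open reading frame")
    return fusion, protein


dna, protein = fuse_in_frame("ATGGCCTAA", "AAAGGGTGA")
print(dna)      # ATGGCCAAAGGGTGA
print(protein)  # MAKG*
```

A linker sequence, if desired, would be inserted between the two partners and would likewise need to be a whole number of codons.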

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
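
As a simple illustration of what an antisense molecule is at the sequence level, the sketch below derives the reverse complement of a target (sense) sequence. The function name `antisense` and its `dna` flag are assumptions made for this example; the sketch says nothing about target-site selection, chemical modification or delivery, which determine whether an antisense molecule is effective in practice.

```python
# Maps each sense base (RNA or DNA letters) to its complementary RNA base.
_COMPLEMENT = str.maketrans("ACGUT", "UGCAA")


def antisense(sense, dna=False):
    """Return the antisense (reverse-complement) strand, written 5' to 3'.

    sense: target sequence in RNA or DNA letters, written 5' to 3'.
    dna:   if True, write the result with T instead of U, as for a
           DNA antisense oligonucleotide.
    """
    result = sense.upper().translate(_COMPLEMENT)[::-1]
    return result.replace("U", "T") if dna else result


print(antisense("AUGGCC"))            # GGCCAU
print(antisense("ATGGCC", dna=True))  # GGCCAT
```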

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either over expressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.


4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.



4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.



4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991). 



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.


4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).


4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.



4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 
Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

25 Therapeutic compositions of the invention can be used in the following: 

Assay for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

30 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or inhibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin 
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g., 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g., from American Type Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses). Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
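
As a non-limiting sketch, the diminution in complex formation measured in a competitive binding 
assay may be expressed as a percent inhibition relative to an uninhibited control. The Python 
code below assumes three hypothetical signal readings (with test compound, without test 
compound, and background); these names and the example values are illustrative assumptions and 
do not describe any particular assay of the invention. 

```python
def percent_inhibition(signal_with_compound, signal_no_compound,
                       signal_background=0.0):
    """Percent reduction in complex formation caused by a test compound.

    All three arguments are hypothetical assay readings in the same units
    (for example, counts of bound labeled ligand).
    """
    window = signal_no_compound - signal_background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific = signal_with_compound - signal_background
    return 100.0 * (1.0 - specific / window)


if __name__ == "__main__":
    # Illustrative numbers only.
    print(percent_inhibition(550.0, 1000.0, 100.0))
```

A compound giving a high percent inhibition in such a calculation would be a candidate for the 
follow-up agonist or antagonist testing described in this section. 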

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
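
The comparison of two otherwise identical cell populations described above may be sketched, by 
way of example only, as a fold-change filter over measured responses. In the Python code below, 
the ligand names, the response values and the three-fold cutoff are hypothetical; the code simply 
flags ligands whose response is markedly greater in receptor-expressing cells than in 
non-expressing cells. 

```python
def candidate_ligands(responses_expressing, responses_control, min_fold=3.0):
    """Return ligands whose response in receptor-expressing cells is at
    least min_fold times the response in non-expressing cells.

    Both arguments map a ligand name to a measured response; min_fold is
    an illustrative cutoff, not a value from the specification.
    """
    hits = []
    for ligand, with_receptor in responses_expressing.items():
        without_receptor = responses_control[ligand]
        # Avoid division by zero when the control shows no response.
        baseline = max(without_receptor, 1e-9)
        if with_receptor / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


if __name__ == "__main__":
    expressing = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.0}
    control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 3.5}
    print(candidate_ligands(expressing, control))
```

Ligands that act equally on both populations, such as the hypothetical ligand_C above, are 
excluded because their effect does not depend on the receptor of the invention. 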

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such 
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 



4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
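
By way of a non-limiting illustration, the sequencing-based and restriction-site-based approaches 
to detecting a polymorphism in DNA, described earlier in this section, may be sketched in Python 
as follows. The function names and the short example sequences are hypothetical and are not 
sequences of the invention; the EcoRI recognition site GAATTC is used only as a familiar example 
of a restriction site that a single base change can abolish. 

```python
def find_single_base_changes(reference, sample):
    """Compare two equal-length DNA sequences and list the differences
    as (1-based position, reference base, sample base)."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


def restriction_site_present(reference, sample, site="GAATTC"):
    """Report whether a recognition site occurs in each sequence, which
    is the basis of a differential digestion pattern."""
    site = site.upper()
    return site in reference.upper(), site in sample.upper()


if __name__ == "__main__":
    reference = "ATGGAATTCCGT"   # illustrative fragment only
    sample = "ATGGAGTTCCGT"      # same fragment with one base changed
    print(find_single_base_changes(reference, sample))
    print(restriction_site_present(reference, sample))
```

In this example the single base change at position 6 removes the recognition site from the 
sample sequence, so the two fragments would be expected to digest differently. 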

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
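
One possible way of analyzing the arthritis scores obtained on the scoring days listed above is 
sketched below in Python, purely as an example. The scoring days are those stated in the 
procedure; the data layout (a mapping from day to a list of per-animal scores), the function 
names and the example scores are assumptions introduced for illustration and are not experimental 
results. 

```python
from statistics import mean

# Scoring days stated in the procedure (days after injection).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def mean_scores(scores_by_day):
    """Mean arthritis score per scoring day for one group of animals."""
    return {day: mean(scores_by_day[day]) for day in SCORING_DAYS}


def score_reduction(treated, control):
    """Control mean minus treated mean for each scoring day; a positive
    value indicates a lower score in the treated group."""
    treated_means = mean_scores(treated)
    control_means = mean_scores(control)
    return {day: control_means[day] - treated_means[day]
            for day in SCORING_DAYS}


if __name__ == "__main__":
    # Hypothetical scores for two animals per group, not real data.
    treated = {day: [2, 4] for day in SCORING_DAYS}
    control = {day: [6, 8] for day in SCORING_DAYS}
    print(score_reduction(treated, control))
```

A statistical test appropriate to the group sizes would ordinarily accompany such a comparison. 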

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
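
For illustration only, the weight-based dosage ranges given above can be converted to a 
per-patient amount by simple arithmetic, as in the following Python sketch. The sketch reads the 
preferred range as about 0.1 µg/kg to 10 mg/kg; the function name and the 70 kg example weight 
are assumptions, and the calculation is not a substitute for the determination of dosage by the 
prescribing physician. 

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def dose_range_mg(body_weight_kg, low_ug_per_kg=0.1, high_mg_per_kg=10.0):
    """Return (lowest, highest) dose in mg for a patient of the given
    weight. Defaults follow the preferred range as read from the text."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * body_weight_kg / UG_PER_MG
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


if __name__ == "__main__":
    low, high = dose_range_mg(70.0)  # hypothetical 70 kg patient
    print(f"{low:.4f} mg to {high:.1f} mg per dose")
```

The broader range of about 0.01 µg/kg to 100 mg/kg can be obtained by passing those values as 
the two optional arguments. 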

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or 
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 
As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or antithrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or antithrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active 
compounds with a solid excipient, optionally grinding a resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee 
cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
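
The 1:1 dilution of VPD described above can be checked with simple arithmetic. The following is a minimal sketch, not part of the specification: the function and variable names are hypothetical, and volumes are assumed to be additive on mixing, which is only an approximation for ethanol/water systems.

```python
# Illustrative arithmetic only; all names here are hypothetical.

# Stock VPD composition in % w/v, as stated in the text.
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_IN_WATER = 5.0  # % w/v dextrose in the aqueous diluent


def dilute_vpd(stock, dextrose_pct, vpd_parts=1.0, diluent_parts=1.0):
    """Return % w/v of each component after mixing VPD with the diluent.

    Assumes additive volumes (an approximation, labeled as such).
    """
    total = vpd_parts + diluent_parts
    final = {name: pct * vpd_parts / total for name, pct in stock.items()}
    final["dextrose"] = dextrose_pct * diluent_parts / total
    return final


# VPD:5W is VPD diluted 1:1 with 5% dextrose in water.
vpd_5w = dilute_vpd(VPD_STOCK, DEXTROSE_IN_WATER)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:.2f}% w/v")
```

Changing `vpd_parts` and `diluent_parts` shows how the proportions of the co-solvent system shift when the dilution ratio is varied.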

Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of the test compound which achieves a half-maximal inhibition of the protein's 
biological activity). Such information can be used to more accurately determine useful doses in 
humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
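
The therapeutic index defined above, the ratio of LD50 to ED50, can be illustrated with a short calculation. This is a minimal sketch only: the compound names and dose values are hypothetical and do not come from the specification, and both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50/ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, chosen only for illustration.
examples = {
    "compound A": (500.0, 5.0),
    "compound B": (40.0, 10.0),
}

indices = {name: therapeutic_index(ld, ed) for name, (ld, ed) in examples.items()}

# A higher index means a wider margin between effective and lethal doses.
preferred = max(indices, key=indices.get)
print(indices, "preferred:", preferred)
```

In this illustration the compound with the larger ratio would be the preferred one, consistent with the stated preference for high therapeutic indices.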

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
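
As a hedged illustration of the regimen criterion above, the fraction of a dosing interval during
which the plasma level is at or above the MEC can be estimated from sampled concentrations.
The sketch below assumes a simple single-exponential decline after each dose, which is a
simplification not stated in the text; the function names, half-life, peak level and MEC are all
hypothetical.

```python
import math


def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced samples that are at or above the MEC."""
    if not concentrations:
        raise ValueError("need at least one sample")
    return sum(1 for c in concentrations if c >= mec) / len(concentrations)


def decay_profile(peak, half_life_h, interval_h, step_h=0.1):
    """Sample one dosing interval of an assumed single-exponential decay."""
    k = math.log(2) / half_life_h          # first-order elimination constant
    n = int(round(interval_h / step_h))    # number of samples in the interval
    return [peak * math.exp(-k * i * step_h) for i in range(n)]


# Hypothetical values: peak 10 units, 6 h half-life, dose every 12 h, MEC 5 units.
profile = decay_profile(peak=10.0, half_life_h=6.0, interval_h=12.0)
coverage = fraction_above_mec(profile, mec=5.0)

# Check the coverage against the broadest stated window (10-90% of the time).
within_broad_window = 0.10 <= coverage <= 0.90
```

With measured HPLC or bioassay data, the sampled list would simply replace the modeled profile.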

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
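
The per-kilogram ranges above can be converted to a total daily amount for a given body
weight. The following sketch is illustrative arithmetic only and is not dosing guidance; the
function name and the 70 kg body weight are assumptions made for the example.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Convert a per-kg range (low end in ug/kg, high end in mg/kg) to total mg per day."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = weight_kg * low_ug_per_kg / UG_PER_MG
    high_mg = weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Hypothetical 70 kg subject.
broad = daily_dose_range_mg(70.0, low_ug_per_kg=0.01, high_mg_per_kg=100.0)
preferred = daily_dose_range_mg(70.0, low_ug_per_kg=0.1, high_mg_per_kg=25.0)
```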

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 
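
For illustration, candidate antigenic peptide fragments of a given minimum length can be
enumerated as contiguous windows of the full-length sequence. The sketch below uses a
made-up sequence, not a sequence of the invention, and the function name is hypothetical;
enumeration alone does not establish that a fragment encompasses an epitope.

```python
def peptide_fragments(sequence: str, length: int):
    """Return every contiguous fragment of exactly `length` residues."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 12-residue sequence used only to show the windowing.
example = "MKDEESGRKLAV"
six_mers = peptide_fragments(example, 6)    # minimum fragment length stated above
ten_mers = peptide_fragments(example, 10)   # a preferred, longer fragment length
```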

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g.,
a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to contain surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte-
Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
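
A minimal sketch of a hydropathy profile of the kind described above is given below. It uses the
published Kyte-Doolittle hydropathy values with a sliding-window average and no Fourier
transformation; the window length of 7, the function names and the test sequence are
assumptions made for illustration. On this scale, lower (more negative) averages indicate more
hydrophilic regions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7):
    """Mean hydropathy of each contiguous window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7):
    """Return (start index, fragment, mean score) of the lowest-scoring window."""
    seq = sequence.upper()
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


# Hypothetical sequence: a lysine-rich stretch flanked by leucine-rich stretches.
start, fragment, score = most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL")
```

A low-scoring window identified in this way is only a candidate surface region; whether it forms
an epitope would still need to be determined experimentally.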

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature. 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 
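
The selection logic described above can be summarized as a small truth table. The sketch below
assumes, beyond what is stated in the text, that unfused lymphocytes have a limited lifespan in
culture and that a hybridoma inherits HGPRT from its lymphocyte parent and immortality from
its myeloma parent; the class, function and cell labels are hypothetical.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    name: str
    has_hgprt: bool  # functional HGPRT enzyme present
    immortal: bool   # able to proliferate indefinitely in culture


def survives_hat(cell: Cell) -> bool:
    """Under the stated assumptions, only HGPRT-positive immortal cells persist."""
    return cell.has_hgprt and cell.immortal


cells = [
    Cell("unfused myeloma", has_hgprt=False, immortal=True),
    Cell("unfused lymphocyte", has_hgprt=True, immortal=False),
    Cell("hybridoma", has_hgprt=True, immortal=True),
]

survivors = [cell.name for cell in cells if survives_hat(cell)]
```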

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
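
As an illustrative sketch only, a generic Scatchard linearization estimates binding affinity by
fitting bound/free against bound, using the relation B/F = (Bmax - B)/Kd, so that the slope is
-1/Kd and the intercept is Bmax/Kd. The code below is not asserted to reproduce the procedure
of Munson and Pollard; the function name and the synthetic binding data (Kd = 2, Bmax = 10 in
arbitrary units) are assumptions.

```python
def scatchard_fit(bound, free):
    """Least-squares fit of B/F versus B; returns (Kd, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    ratio = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratio))
    slope = sxy / sxx                      # equals -1/Kd
    intercept = mean_y - slope * mean_x    # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic, noise-free data generated from a single-site model (hypothetical values).
TRUE_KD, TRUE_BMAX = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [TRUE_BMAX * f / (TRUE_KD + f) for f in free_conc]

kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
ka_est = 1.0 / kd_est  # association constant, the reciprocal of Kd
```

Real binding data contain noise and may deviate from a single-site model, in which case a
nonlinear fit is generally more reliable than this linearization.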

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.



5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab') 2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature. 321:522-525 (1986); Riechmann et al., Nature. 332:323-327 (1988); Verhoeyen et al., 
Science. 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).



5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique, the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
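
The count of ten molecules can be checked by simple enumeration. The sketch below assumes
that each antibody is an unordered pair of heavy/light half-molecules and that any heavy chain
can pair with any light chain; the chain labels H1, H2, L1 and L2 are hypothetical, with H1/L1
and H2/L2 taken as the two cognate pairs.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Every heavy chain may pair with either light chain: four possible half-molecules.
half_molecules = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of half-molecules (the two halves may be identical).
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific molecule contains both cognate half-molecules.
cognate_pairs = {("H1", "L1"), ("H2", "L2")}
bispecific = [ab for ab in antibodies if set(ab) == cognate_pairs]
```

Under these assumptions the enumeration gives ten distinct molecules, exactly one of which is
the correctly paired bispecific antibody, which is why purification steps are needed.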

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithio)propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
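
By way of illustration only, the following minimal sketch shows one way the ASCII-file and
database formats mentioned above could be realized. It is not part of the disclosed method: the
FASTA-style layout, the function names, the table layout, the use of SQLite as a stand-in for a
database application such as DB2, Sybase, or Oracle, and the placeholder records are all
assumptions made for the example.

```python
import sqlite3


def to_fasta(records, width=60):
    """Render (identifier, residues) pairs as plain ASCII, FASTA-style text."""
    lines = []
    for identifier, residues in records:
        lines.append(">" + identifier)
        # Wrap the residues at a fixed width so the file stays readable.
        for i in range(0, len(residues), width):
            lines.append(residues[i:i + width])
    return "\n".join(lines) + "\n"


def store_in_database(records, path=":memory:"):
    """Record the same pairs in a relational table (SQLite is a stand-in here)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


# Placeholder records: identifiers and residues are invented, not SEQ ID NO data.
records = [
    ("example_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    ("example_2", "ATGAAACCCGGGTTTTAA"),
]
fasta_text = to_fasta(records)
conn = store_in_database(records)
row = conn.execute(
    "SELECT residues FROM sequences WHERE seq_id = ?", ("example_2",)
).fetchone()
print(fasta_text)
print(row[0])
```

The text form suits simple sequential access, whereas the table form suits lookup by identifier,
which reflects the statement above that the choice of storage structure follows from the means
chosen to access the stored information.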

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358, in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
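
As a simplified illustration of ORF identification, the sketch below scans all six reading frames
for stretches that begin with ATG and end at the first in-frame stop codon. It is a plain frame
scan and not an implementation of BLAST or BLAZE; the function names, the `min_codons`
threshold, and the example sequence are assumptions introduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(DNA_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) tuples for ATG...stop ORFs in six frames.

    start and end are 0-based, end-exclusive offsets on the strand being scanned.
    min_codons counts codons from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ATG in this frame
    return orfs


# Invented example sequence containing a single short ORF on the forward strand.
print(find_orfs("CCATGAAATTTGGGTAACC"))
```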

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
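
The relationship between target length and chance occurrence can be made concrete with a
rough estimate. The sketch below assumes, purely for illustration, a database of independent and
equally likely residues, which real sequence data do not satisfy; the function name and the
database sizes are hypothetical.

```python
def expected_chance_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random database.

    Assumes independent, uniformly distributed residues (an idealization).
    Use alphabet_size=4 for nucleotides and alphabet_size=20 for amino acids.
    """
    # Number of windows of the target's length that fit in the database.
    positions = max(database_length - target_length + 1, 0)
    # Each window matches by chance with probability (1 / alphabet_size) ** length.
    return positions * (1.0 / alphabet_size) ** target_length


# A six-nucleotide target is expected about once in roughly 4 kb of random sequence,
# whereas a thirty-nucleotide target is essentially never expected by chance.
print(expected_chance_matches(6, 4101))
print(expected_chance_matches(30, 4101))
```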

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
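
As one hedged example of searching for a nucleic acid target motif, the sketch below looks for
candidate hairpin structures, taken here to mean a short stem followed, after a loop, by its
reverse complement. The stem and loop sizes, the function name, and the test sequence are
assumptions for illustration; real hairpin prediction would also weigh thermodynamic stability.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) pairs for candidate hairpins.

    A candidate is a stem of `stem` bases whose reverse complement occurs again
    after a loop of min_loop to max_loop bases.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        partner = left.translate(DNA_COMPLEMENT)[::-1]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the pairing arm would have to start
            if seq[j:j + stem] == partner:
                hits.append((i, loop))
    return hits


# Invented sequence: stem GGAC, loop TTTT, pairing arm GTCC.
print(find_hairpins("CCGGACTTTTGTCCCC"))
```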


4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
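
To illustrate how sequence information feeds the design of an antisense oligonucleotide, the
following sketch returns the reverse complement of a chosen 20 to 40 base window of an mRNA.
It is a minimal sketch only: the function name and the example mRNA are invented, and practical
design would also consider target accessibility, specificity, and oligonucleotide chemistry.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the antisense RNA oligo, written 5'->3', for mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept DNA-style input by converting T to U before selecting the window.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(RNA_COMPLEMENT)[::-1]


# Invented example mRNA; the first 20 bases are taken as the target region.
mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
oligo = antisense_oligo(mrna, start=0, length=20)
print(oligo)
```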


4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents and reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.



4.18 SCREENING ASSAYS 
Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
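
A degenerate pool of this kind can be enumerated from a probe written with the standard IUPAC
nucleotide ambiguity codes, as sketched below. The function name and the example probe are
hypothetical and are given only to show how quickly the pool grows with each ambiguous position.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Invented probe with two ambiguous positions (R = A or G, Y = C or T).
pool = expand_degenerate("ATGRAY")
print(pool)
print(len(expand_degenerate("NNN")))
```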


Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
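
The quantities in the linkage method above can be worked through as a short calculation. The
sketch assumes, for illustration only, that 7.5 ng/ul is the DNA concentration before the
1-methylimidazole stock is added, that the stock is added at one part in ten to reach 10 mM, and
that volumes are additive; the function and parameter names are hypothetical.

```python
def covalink_well_amounts(
    dna_ng_per_ul=7.5,       # DNA in water before 1-MeIm7 is added (assumed)
    meim_stock_molar=0.1,    # 0.1 M 1-methylimidazole stock
    meim_final_molar=0.010,  # 10 mM final 1-methylimidazole
    dispensed_ul=75.0,       # ss DNA solution dispensed per well
    edc_stock_molar=0.2,     # 0.2 M EDC, itself dissolved in 10 mM 1-MeIm7
    edc_ul=25.0,             # EDC solution added per well
):
    """Estimate per-well DNA mass and final concentrations (volumes assumed additive)."""
    # Fraction of the DNA mixture contributed by the 1-MeIm7 stock (0.1 here).
    stock_fraction = meim_final_molar / meim_stock_molar
    dna_after_meim = dna_ng_per_ul * (1.0 - stock_fraction)
    total_ul = dispensed_ul + edc_ul
    return {
        "dna_ng_per_well": dna_after_meim * dispensed_ul,
        "reaction_volume_ul": total_ul,
        "edc_final_molar": edc_stock_molar * edc_ul / total_ul,
        "dna_final_ng_per_ul": dna_after_meim * dispensed_ul / total_ul,
    }


amounts = covalink_well_amounts()
for name, value in amounts.items():
    print(name, round(value, 4))
```

Under these assumptions each well receives roughly 0.5 ug of DNA in a 100 ul reaction containing
about 50 mM EDC, with the 1-methylimidazole concentration unchanged at 10 mM because the
EDC is itself dissolved in 10 mM 1-MeIm7.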


It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be employed in the chemical synthesis of
oligonucleotides directly on a glass surface, as described by Fodor et al. (1991) Science
251(4995) 767-73, incorporated herein by reference. Probes may also be immobilized on nylon
supports as described by Van Ness et al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to
Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references
being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
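
One way to see how a combinatorial strategy reaches 256 probes is to note that four variable
positions over the four bases give 4^4 = 256 distinct sequences, which fit a 16 x 16 grid. The
sketch below enumerates such a grid. That the 256-probe matrix referred to above arises from
exactly four variable positions is an assumption made for this illustration, and the function name
and layout are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def probe_matrix(variable_positions=4):
    """Lay out all 4**n sequences of n variable positions as a square grid.

    4**n sequences fill a (2**n) x (2**n) square; each grid cell stands for one
    spatially defined synthesis site.
    """
    probes = ["".join(p) for p in product(BASES, repeat=variable_positions)]
    side = 2 ** variable_positions
    return [probes[r * side:(r + 1) * side] for r in range(side)]


matrix = probe_matrix()
print(len(matrix), "rows x", len(matrix[0]), "columns")
print(matrix[0][:4])
```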

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
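
The difference between the normal and the relaxed specificity can be illustrated with a short sketch. It is an illustration only, not the procedure of Fitzgerald et al.: the function names and the toy sequence are hypothetical, the DNA is treated as a linear text string, and every matching site is assumed to be cut to completion (an actual CviJI** digest is partial).

```python
import re

# Recognition classes: Pu = A or G, Py = C or T.
NORMAL_SITES = ["[AG]GC[CT]"]                                # PuGCPy
RELAXED_SITES = ["[AG]GC[CT]", "[CT]GC[CT]", "[AG]GC[AG]"]   # PuGCPy, PyGCPy, PuGCPu


def cut_positions(seq, patterns):
    """Return sorted 0-based indices at which the sequence is cut (between G and C)."""
    seq = seq.upper()
    cuts = set()
    for pattern in patterns:
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?=({pattern}))", seq):
            cuts.add(match.start() + 2)  # the cut falls after the G of the site
    return sorted(cuts)


def fragments(seq, patterns):
    """Split a linear sequence into blunt-ended fragments at every cut position."""
    bounds = [0] + cut_positions(seq, patterns) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy = "AAGCTTTGCTAAGCAA"  # hypothetical sequence with one site of each class
    print(fragments(toy, NORMAL_SITES))
    print(fragments(toy, RELAXED_SITES))
```

In the toy sequence only the PuGCPy site is cut under normal specificity, whereas the relaxed pattern set also cuts the PyGCPy and PuGCPu sites, giving more and shorter fragments.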

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
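
The geometry of the 64-patient example can be checked with a small calculation. The sketch below rests on assumptions that are not stated in the text: a 9 mm pin pitch (the usual spacing of a 96-well plate), an 8 x 8 arrangement of the 64 dots within each subarray, and a 1 mm centre-to-centre dot spacing. The names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PATIENTS = 64                # one plate, and therefore one offset, per patient
PIN_PITCH_MM = 9.0           # assumption: standard 96-well spacing
DOT_PITCH_MM = 1.0           # assumption: one dot per 1 mm step
SIDE = 8                     # assumption: 64 dots arranged as an 8 x 8 block


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot left by one pin for one patient plate."""
    if not 0 <= patient < PATIENTS:
        raise ValueError("patient index out of range")
    offset_row, offset_col = divmod(patient, SIDE)  # offset-printing step
    x = pin_col * PIN_PITCH_MM + offset_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + offset_row * DOT_PITCH_MM
    return (x, y)


def total_dots():
    """Number of dots on the membrane: 96 subarrays of 64 dots each."""
    return PIN_ROWS * PIN_COLS * PATIENTS


if __name__ == "__main__":
    print(total_dots())
    print(dot_position(PATIENTS - 1, PIN_ROWS - 1, PIN_COLS - 1))
```

Under these assumptions each subarray spans 8 mm and leaves a 1 mm gap before the next pin position, and the farthest dot lies well inside an 8 x 12 cm membrane.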

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
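
The idea of grouping clones by their probe signatures can be sketched as follows. This is a simplified illustration under stated assumptions, not the SBH analysis actually used: hybridization is idealized as an exact substring match, similarity is measured with a Jaccard index, and the greedy grouping and the 0.8 threshold are hypothetical choices.

```python
def signature(seq, probes):
    """Set of probes that 'hybridize' to a clone (idealized as exact substring match)."""
    seq = seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Similarity of two signatures; two empty signatures count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_clones(clones, probes, threshold=0.8):
    """Greedy grouping: a clone joins the first cluster whose representative
    signature is at least `threshold` similar, otherwise it starts a new cluster."""
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    probes = ["ACGTACG", "GCTAGCT", "TTTTTTT", "GGGGGGG"]  # hypothetical 7-mers
    clones = {
        "c1": "ACGTACGTAGCTAGCTAGGA",
        "c2": "ACGTACGTAGCTAGCTAGGA",
        "c3": "TTTTTTTTTTGGGGGGGGGG",
    }
    print(cluster_clones(clones, probes))
```

One representative per cluster would then be picked for sequencing, which is the step the text describes next.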

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2

Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
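
The seed-extension loop described above can be sketched as follows. The sketch makes several simplifying assumptions: `hits_for` is a hypothetical stand-in for a BLASTN search that returns precomputed hits, hits are looked up per member sequence rather than against a merged assemblage, and the recursion is written as an iterative worklist. Only the two inclusion thresholds come from the text.

```python
def extend_assemblage(seed_id, hits_for, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no further sequence qualifies.

    hits_for(member_id) is assumed to yield (candidate_id, blast_score,
    percent_identity) tuples for sequences that hit the given member.
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:  # terminates when no new sequence extends the assemblage
        current = frontier.pop()
        for candidate, score, identity in hits_for(current):
            if candidate in members:
                continue
            # Inclusion rule from the text: score > 300 and identity > 95%.
            if score > min_score and identity > min_identity:
                members.add(candidate)
                frontier.append(candidate)
    return members


if __name__ == "__main__":
    # Hypothetical precomputed hits, keyed by query sequence identifier.
    hits = {
        "est1": [("est2", 450, 98.0), ("est3", 280, 99.0)],
        "est2": [("est4", 320, 96.5), ("est5", 500, 94.0)],
        "est4": [("est1", 450, 98.0)],
    }
    print(sorted(extend_assemblage("est1", lambda m: hits.get(m, []))))
```

In this toy data est3 is rejected on score and est5 on percent identity, so the assemblage stops growing after est4 is added.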

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences encoded by the nucleic acid sequences are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
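
How the two summary scores relate to per-residue output can be sketched as below. This is not SignalP itself: the per-residue S scores and the cleavage position are taken as given (hypothetical inputs), and the mean S score is assumed to be the average over the residues of the predicted signal peptide, an assumption based on the Nielsen et al. reference rather than on anything stated in this specification.

```python
def s_score_summary(s_scores, cleavage_after):
    """Return (maximum S score, mean S score) for one polypeptide.

    s_scores       per-residue S scores, N-terminus first (hypothetical input)
    cleavage_after number of residues in the predicted signal peptide
    """
    if not 0 < cleavage_after <= len(s_scores):
        raise ValueError("cleavage position must lie within the scored residues")
    max_s = max(s_scores)
    # Assumed definition: mean over residues 1..cleavage_after.
    mean_s = sum(s_scores[:cleavage_after]) / cleavage_after
    return max_s, mean_s


if __name__ == "__main__":
    scores = [0.9, 0.95, 0.8, 0.85, 0.2, 0.1]  # made-up values for illustration
    print(s_score_summary(scores, cleavage_after=4))
```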

5.3.2 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences encoded by the nucleic acid sequences are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences encoded by the nucleic acid sequences are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homologues for SEQ ID NO: 1653-1745 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.
The homologues for SEQ ID NO: 1746-1768 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext
and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences encoded by the nucleic acid
sequences are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homologues for SEQ ID NO: 1769-1786 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin    RNA Source    Hyseq Library Name    SEQ ID NOS:

adult brain    GIBCO    AB3001

9 19-21 50-51 65-66 72 78 80 82 85 87 107-108 113 116 123 138
140 150-152 159 169 177 192-193 202-203 212-214 225-226 235-236
251 258 268-269 272 280-281 295 298 301 321 326 331-332 334
356-357 362 369 379 382-383 416 423 443 459-460 473 475 477
488 496 500 503 519 526 547 574 582 587 608-609 613 618
633-634 645-646 652 657-658 660 669-671 678 687 695 697 710
715 724 731 775-777 796 804 811 857-859 862 869 899-900 912
919 922 924-929 933 936 962 979 988-989 996 1001 1004-1008
1018 1039 1047 1059 1064 1067 1070 1078 1082 1107 1113
1116-1117 1131 1134-1137 1140 1149 1151 1157 1180 1206 1229
1234 1241 1243 1258 1272-1273 1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360-1361 1368 1373-1375 1379 1391
1400 1417 1446 1468 1482 1493-1494 1501-1503 1506-1507 1512
1517 1522-1524 1530-1533 1537 1549 1565 1578 1598 1606 1608
1623 1625 1627 1639 1643 1648-1649 1653 1664 1667 1671 1696
1734 1741 1743-1744 1760-1761 1771



adult brain    GIBCO    ABD003

3 12-14 18-19 25 30-31 34-36 43-45 50-51 56 58 60 65-66 68-69 80
82 85 87 92 104 107-108 112-113 115-116 123-124 131-132 135-137
139 142 146 148-149 152 154 157 159 163 165 167 169 172 180
192-193 196-197 199 203 208 210 212-214 223 233 235-237 247
257 259 261 268-269 272 276 280-281 284-288 291-292 295 297
300-301 304 307 317 320-321 323 327 329-331 333-334 345-349
356-357 379-381 393 401 408 414 419 424 426-428 430 433-436
438-439 443 445 449 453-454 459-461 468 471-473 476-478 483
491 494 496 500 503 507-508 516 519-520 525-527 534 536-540
542-543 545 553 555 560 569-570 574-576 586-588 593 595 597
601 606-609 616-620 622-623 625 628-633 635-636 643 645-649
653 655-656 660-665 668-670 676 681 687 701 710 715 717
724-728 735 743 745-746 750 753 759 765-766 773 775-778 786
789 796 799-800 802-803 810-811 815 817 820-821 832 834-836
840 845-847 851 858-861 864 869 874 878 883 897 901-902
904-905 908 911-914 916 921-922 924-927 929 932-934 936-939
941-942 945 955-958 963 966-969 977 979-980 985-986 990
992-993 997-1001 1005-1007 1012 1017-1020 1023-1024 1029-1031
1034 1036 1039 1050 1059 1063-1066 1078 1081-1082 1085-1086
1089 1097 1103 1107 1109 1112 1116-1117 1119 1121 1124 1127
1130 1134 1144-1145 1149 1151 1157-1158 1167 1170 1178 1184
1188 1190 1193-1194 1200 1202 1215-1217 1220 1226-1227 1229
1231 1241 1243 1247 1252 1258 1263 1267 1269 1279 1281 1284
1286-1289 1293-1294 1306-1307 1312 1316-1320 1326 1333 1338
1341 1344 1348 1351 1355-1357 1368 1374 1377 1380 1386
1389-1390 1394 1400 1409 1414 1422-1423 1425-1427 1437 1443
1446 1454 1456 1458-1459 1468 1470-1472 1478 1482-1483
1487-1488 1493 1497 1499 1506 1508-1511 1517 1522-1524
1530-1533 1545-1546 1548-1550 1552 1557-1559-1563 1565 1567
1569 1571 1586 1588 1591 1593 1595 1598-1601 1608 1611
1620-1621 1624-1626 1628 1630-1632 1636 1640-1641 1644-1645
1647 1649 1653-1655 1657 1664 1667 1669 1673 1678-1681 1686
1690 1694-1696 1701 1709 1711 1719 1722-1723 1726-1727
1731-1733 1738 1740 1743-1744 1747 1749 1753 1757-1758
1760-1761 1765 1771 1785



adult brain    Clontech    ABR001

9 29 68-69 113 115 146 152 206 223 245 277 307 320 324 330-331
344 348 352 362 379 384 393 404 408 414 441-442 454 469 481 490
506 517 586 597 631 641 659 691 715 799 803 833 865 871 875 880
882 908 920 937 1000 1005-1006 1027 1036 1041 1043 1075 1107
1112 1121 1127 1136-1137 1144-1147 1231 1238-1239 1280 1293
1320 1345 1355 1361 1383-1384 1400 1417 1448 1456 1476 1507
1570 1572 1609-1610 1614 1620 1626 1645 1653 1754 1759 1770 1786

adult brain    Clontech    ABR006

5-8 15-16 168 212-213 271 278 280-281 291-292 300-301 310 314
321 326 336-338 341 352 357 359-360 362 369 374 379 384 393
396-397 414 419-420 426-428 430 441-442 453 506 616-617 661 689
785 798 845 1018 1109 1113 1124 1148 1167 1187 1207 1227 1262
1265 1285 1312 1317-1319 1324-1327 1344 1369 1381 1400 1416
1421 1427 1430-1431 1436 1471 1501 1557-1559 1586 1588 1651
1653 1664-1655 1671 1673 1690 1697-1698 1700 1711 1717
1719-1720 1728 1736 1740 1743-1744 1757 1760-1761

adult brain    Clontech    ABR008



5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
208 210 214-215 218 221-226 229
231-232 234-241 245-247 251-253
255 257-259 268-269 271 276-281
285-286 288 290-292 300-302 304
307 309-311 313 315 317-318 320-
322 325-326 328 330-331 333-338
341 344-347 349 352 354 356-357
362 369-373 376 379-380 382 384
387 390-391 393-394 397 399-403
405-411 414-415 417-420 426-428
437-438 440-444 453-455 462 464
467 469-471 476 478 482-484 488-
491 497 503 506-513 516-517 520
524-526 528-530 532-534 537-540
542 544 547-551 553 561 565-567
572-574 577 581 585 587-588 590-
591 597 599 601-602 606-610 612
615-617 619-620 622-623 628-629
631 633-634 636-641 643 645-647
651-653 655-664 669-671 673 679
682 687 689 691-700 702 706 710
715-717 720-721 725-734 736-739
742-743 746 750-752 756 758-759
762-764 766 768 773-778 780-782
784-785 787-789 794 796 799 802-
803 805 811 814-815 818 825-826
834-837 839-840 842-843 856-859
861-862 865 867-872 874-875 881
883-884 887 889-892 894-895 897-
898 901 904 908 910 912 914 917
919 921-924 926-927 930-932 935-
941 943 945 949 953-954 958 961-
963 967 969 971 975 977 981-983
986 988-990 992 997 999-1002
1004-1006 1008 1012 1018-1023
1027 1029-1031 1035-1037 1047-
1048 1053 1057 1059 1063 1068
1070 1072-1075 1077 1081-1083
1085-1093 1095-1096 1108-1112
1114-1125 1127 1131-1133 1135-
1138 1142-1145 1148-1158 1160-
1163 1167 1169 1172 1175 1177
1180 1183-1188 1191-1195 1199-
1200 1204 1206 1211 1213-1216
1222-1223 1226-1227 1229-1231
1234-1235 1241-1242 1244-1263
1266 1269-1271 1276-1277 1279-
1281 1284-1286 1292 1294-1295
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332
1334-1335 1339 1344-1346 1351
1354-1355 1357-1358 1365-1367
1369-1370 1373-1374 1376-1379
1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410
1414 1419-1420 1423 1432-1433
1435 1437-1438 1440-1442 1446
1448 1453-1455 1457 1461 1463-
1464 1466 1468 1471 1477 1480
1482-1483 1496 1502-1504 1507-
1509 1513 1519-1520 1524-1526
1536 1547 1549-1552 1567 1573-
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611-
1617 1619-1621 1623 1625-1626
1635-1641 1643-1645 1649 1651
1653 1656-1658 1664 1669 1671-
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709
Tissue Origin | RNA Source



adult brain 



Clontech 



adult brain 



BioChain 



adult brain 



Invitrogen 



Hyseq 
Library Name 



ABR011 



SEQ ID NOS:



1720-1724 1726-1728 1730-1733
1737-1740 1742-1745 1753 1756-
1757 1759-1761 1765 1767 1771-
1772 1776-1777 1779-1780 1786



ABR012 



ABR013 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320
1413 1640 



adult brain Invitrogen



ABR014 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



adult brain Invitrogen 



ABR015 



419 434-435 441-442 763 789 983 
1320 



adult brain 



Invitrogen 



ABR016 



312 364-365 379 
1674 1722 1785 



1320 1334-1335 



adult brain Invitrogen 



ABT004 



cultured 
preadipocytes 



14-16 22-23 
70-72 78 86 
137 143 146 
196 198 



194 
295 
338 
371 
399 
482 
557 
655 
696 
750 
814 



25 37-39 
94 107 1 
152 161 
210 218 



298 309-310 320- 
346-347 349-350 



Stratagene



ADP001 



379-380 382-383 
401 408 428 438 
502 507-509 
597 602 607- 
669 671-672 
712 715 721 
766 778 780- 
830 837 841 
925 937 949 
968-969 988- 
1016-1019 
1086 1090 
1120-1121 1123- 
1140 1144-1147 
1174 1188 1193- 
1229 1231 1254 
1285 1309 1312 
1343-1344 1356- 
1383-1384 
1434 1442 
1454 1470-1472 
1528-1529 1532 
1557-1559 1561- 
1588 1590 1595 
1610-1613 1615 
1640 1644 1647 
1670 1675 1696 
1727 1738 1760 
1785-1786 



490 
562 
667 
710 
753 
826 
894-895 
961 963 
1005-1006 
1037 1052 
1115 
1137 
1170 
1225 
1280 
1341 

1378-1379 
1423 1429 
1452 
1525 
1554 
1585 
1608 
1627 
1666 
1723 
1779 



43 58 60 
13 116 136- 
173 182-184 
229 259 267 
321 324 336- 
356-357 362 
391 393 396 
459 461 476 
516 526 531 
609 624 652 
687-689 695- 
732 739 743 
781 789 803 
857 869 874 
954-956 960 
989 1000 
1021 1036- 
1109 1113 
1124 1136- 
1151 1167 
1194 1205 
1258 1262 



1334-1335 
-1357 1370 
1403-1404 
1448 1451- 
1482 1499 
1536 1547 
-1562 1567 
1601-1604 
1619 1624 
1660 1664 
1704 1715 
-1761 1768 



5-8 11 17 25 68 
105 110 116 136 
189 196-198 261 
301 318 331 336 
400 428 430-431 
527 549 557 561 
631 637 647 670 
748 782 793-794 
845 858-859 879 
960 982 986 995 
1005-1007 1025 
1039 1045 1071 
1102 1136-1137 



-69 80 82 87 103 
-138 168 171 188- 
267 276 288 293 
-338 379-380 391 
510-512 520 524 
602 618 620 622 
681-682 710 731 
817 834-836 843 
882 893-895 934 
-996 1000 1002 
1027-1028 1032 
1078 1097 1099- 
1140 1219-1220 



1260 
1322 
1370 
1437 
1602 
1660 
1711 
1760- 



1271 
1329 
1371 
1466 
1608 
1662 
1719 
1761 



1297- 

1339 

1398 

1468 

1614 

1673 

1720 

1765 



1298 

1345 

1408 

1533 

1631 

1687- 

1742 

1767 



1314 
1365 
1423 
1539 
1649 
1688 
1746 
1771 



1320 
1366 
1431 
1594 
1650 
1696 
1749 
1785 



adrenal gland 



Clontech 



ADR002 



4-10 15-16 25 29-31 
51 55 60 62-63 65-6 
116 118 122 126 130
170 181 192 198 201 
228 247 251 255 267 
281 285 295 298 311 
349 351-352 354 372 
391 400 410 415-416 
431 434-437 439 445 
477 483 491 493 497 
519 527 535 546 549 
581 588 595 600 602 
628-630 637 645-646 
713 715 719 732 734 
773-778 789 
869 875 883 
930-931 942 
976-977 981 
1004 1049 1055 
1076 1112-1113 



816 
898 
948 
990 



1134-1135 
1181 1188



1227 
1280 
1325 
1348 
1387 
1426 
1463-1464 
1538 1546 



1151 
1209 
1231 1243 
1285 1290 
1327 1330 
1365-1366 
1398 1400 
1436 



1598 
1627 
1671 
1703 
1765 



1609 
1634 
1674 
1717 



829 
904 
952 
992 
1059 
1115 
1158 
1218 
1270 
1293 
1342 
1369 
1405 
1440-1441 
1488 1491 
1567 1573 
1614 1618 
1636 1649 
1678-1679 
1727 1731



43-45 47 50 
6 75 80 102 

137 150 169 
-203 215 227
-269 271 280 
336-338 342 
-373 383-385 
424 426-427 
454 461 473 
-498 503 516 
552 572-573 
608-610 620 
670 679 703 
744-746 758 
837 845 848 
912 922-923 
965 967 969 
-993 1001 
1071-1072 
1121 1127 
1163 1175 
1224-1225 
-1271 1274 

1307 1324- 
-1343 1345 
1378-1379 
1417 1425- 
1444 1454 
1507 1512 
-1575 1588 
1622 1624 
1651 1658 
1691-1692 
-1732 1737 



adult heart 



GIBCO 



AHR001 



4-8 10-11 1 
46 50-52 57 
85 87 89 94 
110 112 114 
127 130-132 
147-151 153 
185 192 195 
215 220 225 
236 251 257 
277 280-282 
298-301 304
325 330 333 
352 354 358 
384 387-388 
408-409 411 
433-439 445 
457 459 462 
483-484 487 
503 506 508 
526 534 536 
560-562 574 
587 589 593 
612 615-620 
645-652 656 
674-675 683 
701 709 712 



5-16 18-21 3 
-58 60 62-63 
97 100 103- 
116 118-119 
134 136-138 
163-164 168 
197 199 204 
-226 229-230 
-260 262 265 
285-286 289
307 309 314 
336-338 345 
361 368 370 
391 393 397 
-412 414-416 
-446 449 452 
469 472-473 
-490 492-493 
510-513 516 
-540 542 546 
-577 581-582 
595 597 604 
622-623 626 
-660 665-666 
-684 687 692 
715-716 719 



4-39 44- 
71 75 82 

104 108- 
122-123 
141-144 

-171 179 



-205 
232 
272 

-292 
321 
349 
380 
401 



212- 

234- 

274 

296 

324- 

351- 

383- 

406 



430-431 
454-455 
476-480 
496-498 
519-522 
549 553 
584 586- 
-609 611- 
632 637 
670-672 
-694 697 
-720 725- 
726 728 730-732 735 738-739 743-
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802
804 810 812 817 821 826 828 830
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919
921-925 927-928 933-935 945 958
961-963 967 969-972 975 977-978
980-986 990 992 999-1002 1005-
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040
1043 1047 1050 1054-1055 1057
1059 1063-1064 1067-1068 1070
1072 1075-1076 1083 1085-1087
1089 1093-1094 1104 1106 1108-
1109 1113 1116-1117 1119 1121
1124 1126 1128 1131-1134 1144-
1145 1148-1149 1151 1158 1167
1169-1170 1175 1177 1192 1196
1199-1200 1202 1206-1208 1211
1216 1218 1222 1227-1229 1232-
1235 1238-1241 1243-1244 1247-
1248 1250 1253-1254 1256-1258
1261 1268 1270-1271 1277 1280-
1282 1287 1292 1298-1299 1306
1308 1317-1321 1324-1325 1330
1332 1334-1337 1339 1344-1345
1349-1350 1354-1356 1359-1360
1365-1366 1369 1371 1374-1375
1378-1380 1383-1384 1389 1397
1400 1403 1409 1417 1423-1426
1437 1439 1442 1444 1446-1447
1450 1453 1468 1470 1473 1479
1481 1488 1490 1501-1504 1519
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553
1555 1560* 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605
1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644-
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681
1684 1686-1688 1704-1705 1709
1711-1712 1717 1724 1726-1727
1731-1733 1737-1738 1741 1743-
1744 1749 1754-1755 1760-1761
1765 1772 1785








adult kidney


GIBCO 


AKD001 


4-8 10-11 17-21 29-31 35-39 42-
45 50-51 56-58 60-61 64 68-69 75
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123
127-133 136-137 139-141 143-144
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199
201 203-206 209-210 212-213 215-
216 223-228 234-236 238 247 251-
253 257-259 261-262 265-269 271-
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304
307 311-313 321 325-326 329-331
333 341 344 348-350 352 356 358-
359 362 364-365 368 370-372 374
376-377 380-382 392 395 398 400-
401 404 407-409 414-415 423-424
430-437 443-444 446 449 451 453-
455 459 461-462 464 467 469 471-
474 476-477 480-481 483 487-488
490- 


491 493 497 


-505 


510- 


513 516- 








520 


CO*5 ROA COg 


-529 


534 


537-540 








544 


547 549 cca 


-556 


560 


562 564 








567 


571-576 578 


582 


586- 


589 592- 








593 


598-599 601 


604- 


606 


608-613 








615- 


619 621-626 


632- 


634 


637-643 








64S- 


652 655 660 


-664 


669- 


672 676 








678- 


679 688 692 


-695 

*W mW mmT 


698 

mW mr 


702 711 








713 


717 719-720 


727 


731 


735-736 








738 


743 745-746 


751 


753 


755 762- 


s 




• 


763 


765 771-773 


775- 


778 


780 786 






• 


788 


793 795-796 


800 


803 


805 808 








810- 


812 8 


14-819 


321 


826 


829 832 








834- 


838 842-845 


848- 


855 

W0 mm* 


857-861 








864- 


865 867 86S 


B71 

mf * mJm 


874 


876-883 








886- 


887 889-891 


893- 


896 


898-900 








902 


906-908 910 


-914 


918 


920 922 








925- 


927 929-935 


937 


940- 


942 945 








948- 


949 951 953 


-958 

m* «■# W 


960- 

mm* V W 


961 963- 








964 


969-970 972 


976- 


978 

mW 9 V 


982-986 








988- 




j £t J J -J 


995- 

^ 


997 


999-1002 








1004 


J. 1/ L/ O 


inin 
j- <j j- w 


1012- 


1013 


1016- 








1017 


1019 


-1020 


1022 


1025 


-1031 








1035 


1038 

J. W J o 


-1040 

mm, U1 w 


1042 


1044 


1047 








1050 


1054 


-1055 


1057- 

mm r 


1064 


1068 








1070 


-1073 


1078 


1085- 

mm mw V mm* 


1086 


1088- 








1089 


1092 


1094 


1097 


1099 


-1102 








1107 


1109 

X Jt W p/ 


-1112 


1116- 


1119 


1121 








1123 


-1125 

Jb «*U 


1132- 


1135 


1140 

«*W mm ^m> W 


1142- 








1143 


1146 


-1147 


1149- 


1150 


1153- 








1154 


1157 


1159 


1163 


1167 


1170 








1178 


-1179 


1181 


1183 

mm- mm %^ 


1192 


1196- 








1200 


1202 


-1204 


1206- 


1211 


1216- 








1219 


1221 


-1222 


1225 


1227 


-1230 








1232 


-1234 


1238- 


1241 


1243 


-1244 








1246 


-1247 


1253 


1257- 

J*W " *^ # 


1258 


1260- 








1261 


1267 


-1268 


1270 


1272 


-1274 








1281 

mm *-m \r 


1283 


1287- 


1289 


1293 


-1295 








1299 


1306 


1308 


1311- 


1313 

mm> tmf m*m *m* 


1317- 


• 


* 




1320 


1323 


1329- 


1330 


1334 


-1335 






- 


1339 


1341 


1349- 


1350 


1353 


-1357 








1359 


1367 


1369 


1373 


1375 

mm. >mm f %f 


1378- 








1379 


1394 


1397 


1400 


1403 

mm m w 


1405 








1407 


-1409 


1417 


1419 


1423 


-1424 








1428 


-1431 


1433 


1437- 


1438 


1442- 




* 




1443 


1445 


-1446 


1448- 


1450 


1453- 








1454 


1459 


1461 


1465- 


1468 


1474- 








1475 


1478 


. 1484- 


1488 


1490 


1492- 








1493 


1495 


1497- 


1498 


1506 


-1507 








1509 


1512 


1518 


1521- 


1522 


1525 








1527 


-1528 


1532- 


1533 


1537 


1540- 








1541 


1547 


-1550 


1552 


1556 


-1559 








1561 


1565 


-1566 


1568 


1571 


1575 








1578 


-1579 


1583 


1586- 


1587 


1589 








1591 


-1592 


1594 


1598 


1600 


1603- 








1604 


1606 


1608 


1611 


1613 


1615- 








1616 


1618 


-1622 


1624- 


1628 


1631- 








1632 


1634 


-1636 


1638- 


1639 


1641 








1644 


1646 


-1649 


1653- 


1656 


1662 








1664 


1666 


-1667 


1670- 


1671 


1676- 








1679 


1683 


-1684 


1686 


1691 


-1692 








1696 


-1699 


1701 


1709- 


1711 


1713- 








1714 


1716 


-1719 


1723- 


1724 


1726- 








1727 


1733 


1737- 


1738 


1741 


1743- 








1744 


1748 


-1749 


1751 


1760 


-1761 








1763 


-1768 


1778 


1780 


1785 




adult kidney 


Invitrogen 


AKT002 


20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133
136-137 140 142-143 149 169 174
Tissue Origin



adult lung 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



GIBCO 



ALG001 



lymph node 



Clontech 



ALN001 



181 197 
261-265 
301 304-305 
344-345 349 
383 387 392 
443 445 449 
504 506 513 
540 546 554 
607 616-617 
664 695 709 
775-777 788 
838 849-850 
890-892 898 
925 927 934 
962 968 970 
1044 1052 
1073 1085 
1111 1113 
1136-1137 
1192 1196 
1256 1264 
1293-1294 
1325 1330 
1356 1369 



487-488 
528 536- 



227-228 235-236 244 251 
267 280-281 286 290 299 
309 312-313 339 341 
358 370-372 376 382- 
401 414 416 421 430 
453-454 472 
516 519 522 
585 587 594 598 602 
626-627 636 643 662- 
721 735 743 761 768 
796 804 814 827 837- 
852-853 869-870 881 
903 905-907 914 919 
941 949 952 957 960 
1000 1008 1029-1030 
1055 1063 1067-1068 
1099-1102 1107 1110-
1115 1119 1126 
1146-1148 1153 
1199 1232-1233 
1272-1273 1281 



1299 1312 
1344 1349 
1378-1379 



1419 1428-1429 1436 
1463-1464 1467-1468 



1478 
1529 
1623 
1647 
1670 
1776 



1486 
1534 
1629 
1652 
1673 



1491 
1547 
1631 
1660 
1686 



1509 
1596 
1634 
1664 
1709 



1320 
1351 
1403 
1446 
1470 
1519 
1600 
1638 
1667 
1727 



1134 
1159 
1241 
1285 
1324- 
1355- 
1414 
1458 
1477- 
1527 
1619 
1643 
1669- 
1740 



4-8 14 37-39 44-46 50-51 56 62- 
63 75 82 88 93 103-104 113 125 
133 140 143 150 152 154 157 162 
171-172 174-175 190-191 196 200 
211 214 219 223-224 227-228 251- 
252 256 265 272 274 280-281 285 
310 332 345 351 362 371 381-382 
394 408-409 431 436 445 454 459 
461 467 469 471 476-477 488 504 
513 527 537-540 544 547-548 554 
564 583 607 616-617 
634 645-646 662-664 
719 743-744 763 766 
811 814 817 831-832 
852-853 858-859 861 
901 905 941 954-957 
979 981 987 990 992 
1005-1006 1014 1017 



623-624 
695 716 



621 
670 

774 789 
837-838 
866 880 
966 971 
996 1001 
1045 1047 



803 
845 
887 
977 



1054 1059 1062 1064 
1086-1089 1094 1107 
1136-1137 1142 1150 
1190 1200 1208 1220 
1273 1280 1282 1295 
1331-1332 1353 1374 
1384 1404 1409 1423 
1442 1474 1478 1494 
1525 1531-1532 1547 
1554 1571 1598 1606 
1627-1629 1632 1642 
1669 1676-1677 1684 
1731-1732 1737-1738 
1786 



1072 
1126 
1157 
1241 
1306 
1379 
1434 
1509 
1549 
1613 
1644 
1696 



1080 

1134 

1173 

1272- 

1320 

1383- 

1436 

1522 

1553- 

1624 

1662 

1727 



1748-1749 



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481 
482 503 526 529 537-540 546-547 



young liver 



GIBCO 



621 626 649 679 719 
793 803 831 834-836 
858 866 879 905 913 
1005-1006 1012 1038 
1117 1151 1199 1204 
1265 1274 1324-1325 
1374 1377 1440-1441 
1549 1600 1618-1619 
1644 1653 1687-1688 
1741 1771 



725-726 738 
838 844 857- 
928 963 976 
1050 1116- 
1226 1243 
1339 1353 
1447 1504 
1631 1641 
1691-1692 



adult liver 



Invitrogen 



ALV001 



ALV002 



5-8 11 20-21 46 50-51 58 65-66 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
187-189 194-195 198 209 214 
258 267-269 280-281 



174 
215 
306 
374 
414 
493 
549 
607 
648 
717 
766 
814 
893 
919 
970 



230 
309 
392 
431 



250 
342 
394 
444 



510-512 



351 
398 
455 
516 



356 
401 
459 
520 
585 



571 574-577 
621-624 628-630 
660 666-667 678 
719 728 730 734 
770 773 779 788 
841 849-851 871 
898-900 902-904 
922 924 934 953 
984 986 997 1001 



359 362 372 
407-408 410 
476 478 483 
522 526 536 
592 601-602 
632-633 637 
697-698 700 
738 744-745 
800 808 812 
874 879 887 
906-907 911 
957 963 965 
1004 1007 



1012 1029-1030 1033-1034 1052 1061 1066
1070 1076 1086 1089 1093 1099-1102
1110-1112 1116-1117 1119 1121 1125
1136-1137 1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220 1241 1244
1262 1270 1275 1279 1283 1295 1317-1320
1332 1339 1344 1359 1362-1363 1379
1383-1384 1403 1415 1430-1431 1437 1450
1467 1475-1476 1483-1484 1494-1495 1498
1505 1512 1516 1518-1519 1526 1529 1547
1550-1552 1557-1559 1555 1583 1587 1597
1609 1614 1620 1631 1637 1641 1644
1654-1655 1662 1667 1669 1684 1691-1692
1702 1711 1725 1738 1741 1743-1744 1758
1760-1761 1763-1765 1769





5-8 17 20-21 32-33 41 55 58 
75 77 86 89 102 108 117 119 

209 231 
284 306 
374 376 
433-435 
528 



430 



176 198 200 
272 275-276 
333 356 359 
414 428 
503-505 
561-563 
632 637 
707 710 
794 814 
850 858 
911 918 
978 986 
1053 1063 
1089 1093 



64 

175- 
250 
325 
408 
494 

517-518 528 534 544 552 
567 578 581 608-609 630 
644 650 661 665 672 702 
721-722 750 753 778 782 
820 826 834-837 847 849- 
861 874 879 893 898 904 
921-922 926 946 948 972 
996 1020 1027 1031 1034 
1068 1070 1073 1086 
1097 1113 1119 1156 



235-236 
316 321 
398 401 
454 476 
534 544 
608-609 
665 672 



1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



1550 1567 1578 1581 1583 1594 
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645 
1647 1652 1654-1655 1660 1666 
1669-1671 1684 1706 1722 1737- 
1738 1742-1744 1760-1761 1753- 

1765 1772 1774 

29 67 6 997 1063 1119 1536 1766 
1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 
189 192-203 207 209 
221-224 229-230 234 
247 255 258 260-262 
272 274 277-281 
295 299 301-302 



adult liver 



Clontech 



ALV003 



adult ovary 



Invitrogen 



AOV001 



313-314 316 321 
333 335-338 341 
356 358 360 362 
379-384 387 390-392 
400 403 408-410 412 
424 426-427 430-435 
448-449 451 453-455 



284-286 
304 307 
323-326 
344 349 
370-372 
394 



182-186 
211-215 
242-243 
265-269 
288 



188- 
219 
246- 
271- 
290 
309-311 
330 332- 
352-353 
376-377 
397-398 
414-416 423- 
439 443-446 
462-463 468- 



471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 



566-567 569-570 
579 581 583 585- 



599 601- 
624-627 
644-647 
667-675 



554-555 561-564 
572-573 575-576 
588 590-591 593 595 597 
605 607-613 615 618-622 
630 632-633 636-640 642 
649-652 654-655 657-665 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796 
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 



1001 1004-1009 1011-1013 1016 1019-1020
1024-1025 1029-1031 1033-1035 1037 1039
1041-1047 1050-1051 1054-1060 1062-1064
1067-1070 1072-1073 1075-1076 1078-1079
1085-1086 1089-1090 1094-1096 1098-1103
1106-1108 1112-1117 1119-1120 1123-1127
1131-1135 1142-1143 1146-1149 1153 1156
1158 1163 1165-1166 1169-1171 1173-1175
1177-1178 1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217-1219
1221-1226 1232-1235 1238-1241 1243-1244
1247 1249 1252-1254 1256-1258 1262 1265
1267-1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298-



adult placenta 



Clontech 



placenta 



Invitrogen 



APL001 



JL 4. j P 


ijuo 


1308 


1312 


1 J 1 f - 


1 1 "J 0 


JL-5 Z. J 


A. J A I 


1329-1330 


0 "5 *0 1 


liii 


XJ JO 


j. j j j 


1341 


1343 


- Ubl 






Ijo X 


1365-1366 


1 O "71 
1J / 1 - 


i ooc 
1.5 /b 


T T77 - 

-Lo I f ~ 






"1J of* 


lo bb 


o o d a 
U89 


1 ~i QA 


i /inn 


X"± U*± 


If* 10 


-Ifll / 


14 iZ- 


1497 
X^xc, f 


Xfx<£-7 


1?J X 


if* Jb 


-If* JO 


143 9 — 




1 A^C- 
Iftf&D 


i a Kn 

1ft DU 


iftb j 


-Iftbfi 


1459 


T /izf o 
InOJ - 


1 4 iZA 
IfiOf* 


I A 

II DO 


o a c a 


1 A *7 A 

14 /U 


0 A 1 A 


1 A fl 0 
1*1 O 1 


1ft Of* ~ 


1 /IOC 
1ft t}3 


1/QQ 


1 /I O 1 

14 yi 


o a m 
14 93 - 


If* if ft 


14)7b - 


1 A QQ 
Ifk SO 


lb u l 


- lbU4 


lbUb- 


1 C A *7 


Ibxl - 


1 CIO 

1 b l / 


0 c i a 

1 b i y 




1524 


xbzb - 


t con 
lb«£ / 


n con. 


TOO 

13J 1 


i co >i 
lb Jft - 


1536 


lb J o - 


1 CIQ 


1 C/11 

Ibftl 


lbft o 


1.548- 


1550 




ibbb - 


lbb9 


Ibbl 


1 C^*0 


t e e £ 
1556- 


o e c *i 
lbb f 




lb /u 


1 C TO 

lb / z 


1574 - 


1575 


lb / o 


i c o n 
IboU - 


1581 


1587 


1 c o o 

-1588 


icon 

1590- 


lb y JL 


Ibyb 


1597- 


1598 


16U0- 


1606 




o d i 
Ibll- 


1621 


1623 


-1630 


1634 


lb J b 


1 CO Q 

loJo 


1641 


1643 


1 £ A C 

1645 


1647-1657 1659-1662 1664 1667 1669-1671
1673-1674 1676-1681 1683-1690 1699
1702-1707 1710-1711 1713-1714 1716-1719
1723-1724 1726-1728 1731-1733 1735
1737-1738 1740-1741 1743-1744 1748-1751
1753 1755-1756 1760-1762 1765 1767-1768
1770-1771 1776 1778-1779 1783-1784 1786





5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



adult spleen 



APL002 



14-16 26 
106 116 



198 210 
309 329 
423 430 
491 517 
738 746 
858 916 
1005-1006 
1068 1070 
1160 1277 
1345 1429 
1486 1490 
1592-1593 
1664 1673 
1746 1776 



29 43 60-61 
135 171 177 
216 235-236 
334 339 359 



434-435 448 
522 631 723 
769 818 843 
948 953-954 
1013 1033 
1086 1139 
1285 1317-1320 
1435 1438 1454
1512 1519 
1602 1626 
1675 1722 



79-80 103 
180 194 196 
272 290 299 
379-380 417 
454 483 490- 
725-726 728 
854-855 857- 
976 988-989 
1036 1064 
1144-1145 
1343 
1482 
1532 1549 
1647 1649 
1727 1730 



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 29 34-36 44-45 57 60
82-83 87 89 94 98-99 103 106 108 117 119-121
139 141 147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211 215 219 234
253-254 256 258 264 272 280-281 290 295 302
309 312 325 333 341 349 358 372 382 386-387
394 406 414 431 434-436 446 448 451 473 481
490-493 500 503 505 517 519 530 534 536-540
547 554 557 574-576 582 592 595 604 611-612
620-621 623 631-632 642 652 659 661 667 671
673-675 684 700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794 810-811 817
822 830 832 845 848 852-853 858 862 866 874
879 882






WO 01/53312 



PCT/US00/34263 



Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:



testis 



GIBCO 



ATS001 



Genomic DNA 
from BAC 63118 



Genomic DNA 
from BAC 39316 



Research
Genetics 
(CITB BAC 
Library) 
Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



884 906-908 
927 934 942 
978 983 990 
1005-1007 1010 
1042-1044 
1070 1076 
1113 

1174 1177 



1109 
1170 
1220 
1246 
1301 
1339 
1364 
1417 
1474 
1512 
1560 
1651 
1674 
1727 
1761 



1226-1227 
1258 1269 
1320 
1349 
1369 
1434 
1477 
1522 
1567 

1654-1655 
1678-1679 
1733 1738
1774 1779 



912 919 921-923 926- 
949 957-958 963 977- 
992-994 996-997 999 
1031 1036 
1059 1068 
1094 1103 
1140 1163 
1196 1219- 
1236 1241 
1274 1295 



1012 
1046 1049 
1089-1090 
1115 1124 



1190 
1229 
1271 
1322 1330 
1351 1353 
1374 1386 
1436-1437 1439 
1480 1485-1487 
1525 1544-1549 
1591 1600 1631 
1658 1662 
1684 1686 
1740-1741 
1781-1782 



1334-1335 
1359-1360 
1397 1413 
1468 
1498 
1553 
1636 
1670 
1700 
1760- 



196-197 
258 261 



311 
398 
437 



316 
410 
446 
493 



481-482 

526 547 552-553 
575-576 581-582 
605 612 615-617 620 
649-650 656 660 
712 719-721 
773 780 



814 
866 
926 
977 



723 
784 
826 
869 
929 
981 



5-8 10 26 30-31 47 
69 82 84-85 97 102 
139 150 152 154 
176-177 192 194 
227-228 247 255 
288-289 301 307 
349 370-372 392 
427 430-431 433 
469 473 477 
503 513 522 
564 572-573 
599-602 
637 647 
674-675 
738 744 746 
802 804 809 811 
843 845 848 859 
913 916 919 921 
960 963 971 975 
993 1007 1016 1029-1030 
1035 1038-1039 1045 
1064 1070 1072-1073 
1097 1099-1102 1104 
1141 1149 1161-1162 
1209 1222 1227 1229 
1238-1239 1243 
1289 1291-1293 
1320 1330 1332 
1373-1374 1379 
1409 1423-1424 
1443 1459 1484 
1496-1497 1501 
1527 1530-1531 
1549 1563 1565 
1577 1586 1591 
1628 1630-1632 
1649 1661-1662 
1675 1684 1690
1717 1724 1730 
1767 1779 
686 1352 1412 



50-51 57 68- 
113 119 137 
156 163 169 174 



212-215 
282 285 



330 
415 
454 
499 



334 
426- 
461 
502- 
563- 
585 
631 
670 
731 



1253 
1307 
1338 
1389 
1430 
1486 
1505 
1533 
1567 
1599 
1636 



665 
728 

788-789
831 837 
877 905 
937 950 
990 992 
1034- 
1059-1060 
1087 1089 
1113 
1208- 
1235 
1287- 
1317- 
1369 



1108 
1175 
1231 
1285 
1311 
1345 

1399-1400
1435-1437 
1490 1493 
1509-1513 
1537 1546 
1569 1571 
1602 1625 
1639 1642 
1666-1667 1670 
1699 1705 1712 
1737-1738 1752 



1411-1412 
Tissue Origin



Genomic DNA 
from BAC 39316 



adult bladder 



bone 



RNA Source 



Research 
Genetics 
(CITB BAC 
Library) 



Invitrogen 



Hyseq 
Library Name 



BAC003 



BLD001 



SEQ ID NOS 



1352 



5-8 17-18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 363 382 



413 415 424 430 
543 562 564 607 
652 667 671 710 
773 786 788 837 
909 918 929 966 
1025 1055 1073
1185 1189 1199 
1536 1560 1573 
1637 1649-1650 
1669 1671 1690 
1732 1739 1741 



443 483 502 542- 
616-617 626 635 
727 755-756 762 
840 866 893 898 
977 983 1016 
1082 1140 1167 
1270 1369 
1596 1614 



1654-1655 
1719 1727 
1760-1761 



1481 
1636- 
1658 
1731- 
1779 



marrow 



Clontech 



BMD001 



3-8 11 13 18 29-31 33 35-36 40
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 
187 192-193 197-198 
213 215 217 219 222 
235-237 242-244 255 
264 266 273 276 278 

307 312-313 
352 357-358 
387 389 394 



283 



295 
333 
382 
412 
437 439 
461-462 
482 485 
513 516 
540 542 
569-577 
603-604 
632-633 
660 666 
701 708 
740-742 
773 
796 



301-302 
339 343 
384-385 
416 421 



233 
263- 
290 
330 



424-427 
441-442 445 
471-472 475 
488 493 498 
519 523-524 
544-545 549 
581 583-586 
608-609 
636-637 



172 178-180 
203-205 210- 
224-226 
258 260 
286 
321 

370-371 
408 410 
429-431 436- 
447 454-456 
477-479 481- 
500 503-506 
526 530 535- 
555 565 567 
588 593 601 
613-619 621-622 
642 649-650 656- 



775-778 
798 802 
832-833 
858-859 
890-892 
922-924 
952-953 
981 985 



670 672 674-675 
716 718-720 731 
744-745 752 761 
780 785-786 



679 683 
735-736 
765 772- 
789-791 



830 
855 
883 
914 
941 
976 

1002 1005-1007 
1028-1031 1033 
1042 1044 
1059 1061 
1079 1106 
1124 1126 
1145 1163 
1200 1202 
1228 1240 
1270 1278 
1291 1293 



810-812 
837-838 
866-867 
896 903 



823-824 826 
843-844 848- 
869 878-880 
905 908 912- 



1317-1320 
1346 1349 
1369 
1400 
1419 
1433 



927 930-931 
955-958 963 
987 990 
1013 
1035 
1047 1050 
1063 1066 
1110-1113 
1134-1135 
1172 1178 
1216-1217 
1246 1254 
1281 1285 
1299-1301 
1327 1331 
1353 1356 
1372-1374 1379-1380 
1403 1406 1408 1413 
1423 1425-1427 1430-1431 
1439 1443 1446-1449 1459 



937 939- 
969 973 
992 995 1000 
1016 1025 
1037 1039 
1053-1054 
1070-1071 
1115-1117 
1142 1144- 
1199- 
1227- 
1266 
1290- 
1314 
1343 
1367 
1394 
1417 



1197 
1224 
1261 
1287 
1308 
1339 
1361 



1463-1464 1482 1486 1493-1494 
Tissue Origin



RNA Source 



bone marrow 



Clontech 



Hyseq 
Library Name 



BMD002 



bone marrow 



Clontech 



bone marrow 



Clontech 



SEQ ID NOS: 





1509 


1513 


1521- 


-1522 


1524 


1526 



1528 


1531 


1536- 


-1537 


1543 


J. Zf ^ D 


1548- 


-1549 


1552 


1554- 


-1555 


1557- 


1559 


1571- 


1572 


1581 


1589- 


1592 


1597- 


-1600 


1609 


1614 


1621 


1626- 


1628 


1630- 


1632 


1634 


1636 


1638- 


-1639


1641 


1646 


-1647 


1651 


1653- 


1655 


1661- 


-1662


1676 


-1681 


1684 


1686 


1690 


1702 


1707 


1711 


1713- 


-1714


1717 


1720 


1722 


-1723 


1727 


1737 


-1738 


1740 


1758 


1767 


1772 


1781 


-1782 


1785 


-1786 





11 15-16 19 
83-84 93 99 
139 169-170 
212-213 219 
255 259 264 
292 295 301 
316 324 326 
353 357 360 
397 403-404 



429-430 
465-466 
520 523 
569-570 



433-436 
472 475 
525 531 
581 583 



30-31 35-36 
103 108-109 
174 177 180 
222 225-226 
273-274 284 
303-304 307 
330 334-335 
370-373 384 
414-416 421 
440 444 
478 491 
545 548 



68-69 75 
118 137 
190 193 
232 237 
286 290- 
312-313 
348 352- 
386-387 
425-427 
451 454 
493 516 
552 566 
597-598 
652 656 
710 718- 
761 765 
830 834- 
871 878- 



590-591 

601 616-617 621 641 650 
659 671 674-675 679 684 
719 728 734 737-738 742 
774-778 790 811 814 818 
836 854-855 859 866 869 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058
1106 1112-1114 1155 
1200 1223 1227-1228 
1260-1261 1282-1283 
1295 1314 1317-1321 
1333 1341 1343 
1355-1357 
1377 1379 
1397 



1088-1089 
1157 1192 
1236-1237 
1285 1287 
1324-1327 
1347 1350 
1369-1370 
1383-1384 
1413 1417 



1330 
1353 
1373 
1394 

1425-1427 
1446 1459-1460 1470 
1521 1536 1546-1549 
1574-1578 1598-1600
1631 1634 1646 1649 
1658 1669-1670 
1688 1690-1693 
1704 1707-1709 
1723 1725 1727 1729 



1400 
1438 
1493 
1560 
1621 
1653 
1683-1684 
1696 1699 
1711 1720 



1367 
1381 
1406 
1442 
1505 
1573- 
1626 
1656 
1687- 
1702 
1722- 



BMD004 



BMD007 



1738-1740 
1760-1761 
1786 

73-74 503 



1743-1746 
1767 1777 



1731-1733 
1752 1755 
1781-1782 



922 1036 1711 



95-96 666 1320 
17 56-58 103 110 
179 185 188-189 
218-221 225-226 
288 310 312 320 
408 420 455 
590-591 615 
684 697 710 
788 826-827 
866 872 898 



1475 

144 150 171 
204-206 210 
237 251 277 
359 386 388 
485 503 510- 
647-648 665 
725-726 743 780 
848-850 854-855 
918 921-923 953 



adult colon 



Invitrogen 



CLN001 



394 
512 
672 
786 



117 
201 
231 
333 
481 
635 



858 

976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
Tissue Origin



Mixture of 16 
tissues - 
mRNAs 



Mixture of 16 
tissues - 
mRNAs* 
adult cervix 



RNA Source 



Various 
Vendors 



Various 
Vendors 



BioChain 



Hyseq 
Library Name 



CTL016 



CTL021 



CVX001 



SEQ ID NOS: 



1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631 
1639 1645 1650 1675-1677 1687- 
1688 1701 1713-1714 1724 1740 
1765 



401 1490 1686 



312 782 1132-1133 1403 1712 1715



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
163 170 179 181 186 192 195- 
198 201-202 218-219 222 229- 
257 266 276-277 285-286 288 
304 307 312-314 324 
335 342 352 358 
379 381-382 384 
414 416 



156 
196 
231 
298 
326 
362 
388 



332 
376 
410 



301-302 
329-330 
371-372 
398 400 
426-427 430-431 
448 461-462 454 
483 491 493 
516-517 526 
547 557 561
582 585-586 
602 604-605 
623 644 650 



496 
530 



433-436 
471-477 
503 506 
535 



419-420 
439 446 
479 482- 
510-513 
542-544 546- 
572-573 575-577 581- 
588-589 593-594 600 
607-609 612 615-619 
654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-933 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 
1033 1036 1038 1045 1047 
1066-1067 1071 1073 
1082 1098 1113 1124 
1139 1146-1149 1163 
1173 1175 1177 1181 
1202 1211 1214 1216 
1225 1227 1232-1234 
1243 1258 1264-1265 
1279 1287-1290 1308 
1316 1320 1323 1327 



1056 
1079 
1134 
1170 
1200 
1222 
1241 
1270 
1311 
1349 



1027 

1053- 

1075 

1129 

1167 

1197 

1221- 

1240- 

1268 

1310- 

1345 



1353-1354 



1383-1384 1386 



1360 1372-1374 
1394 1397 1405- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1406 
1437 
1466 
1503 
1531 
1585 
1609 
1626 
1649 
1674 
1702 
1724 
1741 
1760 
1786 



1416 
1442 
1472 
1506 
1533 
1589 
1614- 
-1628 
1653 
-1675 
1709 
1729 
1743 
-1762



1425- 
1446 
1478 
1512 
1541 
1597- 
1616
1630 
1656 
1683 
-1710
1731-
-1744
1767 



1427 
1448 
1482 
1522 
1547 
1598 
1620 
1638 
1662 
1685 
1715 
1732 
1748 
1773 



1431 
1453 
1496 
1527- 
1569 
1600 
1623- 
1641 
1667 
-1688 
1717 
1735 
-1749 
1778 



1436- 
1459 
1501- 
1528 
1571 
1608- 
1624 
1643 
1669 
1699 
1722 
-1739
1755 
1785- 



diaphragm 



BioChain 



DIA002 



endothelial 
cells 



Stratagene



EDT001 



137 282 289 730 
1478 1599 1614 



780 986 1409 



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 
161-163 166-172 176-179 
192 194 196-201 204-207 
214 220 224 229-230 233 
240-241 251-252 258 
267-269 272 276-277 
285 288 290 295-296 
311 313 316 321 325 
335 340-342 351-355 
380-382 384 387 390 
407-408 410 412 414 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 



152-153 
187 190 
210 212- 
235-236 
261-262 265 
279-281 284- 
301-302 310- 
329 331-333 
360 371 375 
392 397 400 
416 425-427 



572-576 579 581 585-586 
595 597 599 603 607-612 
620 622 626 630 632-634 
644 647 656-660 662-664 
678 680-682 692-697 707 
712-713 719 730 732 734 
743-746 751 759 768 771 
778- 783 786-789 793 800 
807 810-811 814 816-818
824 826 828-829 832 
845 848-850 854-860 
871 874 876-879 883 
891 894-895 893-900 
913 916 919-922 924 
935 939 943 948-949 
959-961 964 969-970 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 

1134-1135 1138 
1148-1149 1153 
1171 1183-1184 
1205-1207 1211 
1221 1225 1229 
1238-1241 1243-1244 
1253 1257-1258 1261 



589 593 
615-617 
638-641 
670 673 
709-710 
736 738 
773 775- 
803 805- 
821-822 
834-838 842- 
862 864 869 
885 887 890- 
903 908 910-
926-928 930 
951-954 957 
973 975-978



1126 1128-1131 
1140 1144-1145 
1157 1160 1163 
1198-1199 
1216-1217 
1232-1235 
1246 1250 



1202 
1219 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1 7rc_ 




1268 


1270 


— XZ /X 


1 974 


17 77 




1283 


1285 


w i one 




1290


1293 


1295 


1298 


1 30R 


1317 
1J x^ 


1317-

XJ X f 


1320 


1324- 


1325 


1397 


1 3 ?Q_ 


X wJ J V 


1334-

X J w "S 


1335 


1338 


1 347 - 


1 **4"* 
ijtJ 


x j 1 3 


1 347 


1350 


1355 




1 3 f^Q 


1367




1374


1376 


XJ / J 


1 lap 


X** U U 


Xfr U D 


1408 


1414 


1417


1 41 Q 


1 4 9 4 - 


1 47fi 
X *4 O 


1428-1431 


1434-

X*4 jfi — 


1 43 P 
X f* J o 


1440-

X*± fi U — 


1 44*5 


1448 


1450 


1 4C7 - 




1 4 C A 
X *X D O 


1 479 
li 


1474 


1478 


1 4fl7- 
Ivo / 


l^i o o 


1 4 Q 1 - 


Xft J J 


1501- 


-1504




13U7 


1 CI 1 
X3 X X 


IjIo 


1520- 


1521 


1 coir 


1 CO Q 


1 en 
XjjX 


13J b ~ 


1537 


1539 


-XOflU 


lD4b - 


-LOft / 




1552 


1555 


133 / - 


133? 


13 Ol w 


ISO 3 


1568 


1571 


13 /3 


13 / O — 


1 CTQ 


X3BX- 


1583 


1587 


- 13O0 


13?U 


1 CQO 
IOjZ 


isy / 


1605- 


1606 


loll 


i f i i 
lol J 


1615 


1618- 


1621 


1624 


-1628 


1630- 


1631 


1634 


1636 


1638 


1641 


1643- 


1650 


1652- 


1659 


1664 


1666- 


1667 


1669 


1671 


1675- 


1681 


1683- 


1688 


1696- 


1698 


1703 


1711 


1715- 


1716 


1719 


1722- 


1723 


1726 


1731- 


1733 


1736 


1739- 


1741 


1743 


-1744 


1749 


1755 


1760- 


1761 


1765 


1767- 


1768 


1771- 


1773 


1776 


1779 


1783- 


1786 



Genomic DNA 
from 
Genetic 
Research 
BioChain 



Genomic clones 
from the short 
arm of 
chromosome 8 



EPM001 



286 686 1297 1303-1304 1352 
1411-1412 1754 



esophagus 
fetal brain 



ESO002 



131-132 261 289 380 
1000 1007 1397 



503 860 892 



Clontech 



FBR001 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR006 



5-9 25 43 60 62-63 65-66 70 
80 87 92 101 103 108 114 13 
149 152-153 157 168 171-172 



207-208 210 
238 251-253 
301-302 307 
330 333-334 
357 370 373 
391-392 397 
411 417 421 
437 440-443 
476 483 488-489 
513 516 519-520 
544 547 550 561 
590-591 595 
623 628-629 
657-658 660 
689 691-694 
710 716 720 
744 757-760
806-807 810 
858 861 864 
894-895 898 
936 938 945 
959 961 963 



212-213 221-226 
265 272 279-281 
310 317-318 321 
336-338 346-347 
377 379-380 382 
399 
424 
454 



597 
631 
665 



402 406-408 
426-427 430 
460 464 467 
497 508 
530 537 
572-574 
607-609 
638-640 
674-675 
699 701 
734 736 
780 
839 



495 
524 
567 
604 
634 
669 
696-697 
728 732 
763 775-778 
817-818 826 



871-872 884 890 
904 915 921-923 
950 952 955-956 
967 969-971 990 



72 

6 139 
175 
237- 
295 

-324 
352 
384 
410- 
436- 
473 
510- 

-540 
582 
615 
655 
679 
706 
742- 
799 
843 

-891 
935- 
958- 
992 
Tissue Origin
RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



999 1 
1016 
1035 
1065 
1114- 
1151 
1172- 
1190- 
1226- 
1253- 
1270- 
1314 
1339 
1371 
1386 
1425- 
1440- 
1502- 
1519 
1559 
1611- 
1640 
1693 
1718 
1730- 
1742 
1767 
1786 



001 1005-1006 1008 1 
1022 1024 1029-1030 
1042 1047-1048 1052 
1067 1070 1082 



1115 1119 
1153-1156 
1173 1178 
1200 1211 
1227 1229 
1255 1258 
1273 1281 
1317-1320 
1341 1344 
1373 
1392 
1426 
1441 
1503 
1536 
1573 
1614 
1651 
1696 
1720 
1733 
1745 



1089 
1131 1143- 
1160 1163 
1184 1186 
1216 1222- 
1231 1236 
1260 1262 
1287 1308- 
1326 1334- 
1350 1356 
1376 1379 1381- 
1396-1398 1419 
1428-1429 1432 
1448 1466 1470 
1507 1511 1513 
1544 1549-1550 
1589-1590 1598 
1619 1621 1625- 
1657-1658 1676- 
1703-1704 1713- 
1722 1724 1726 
1735-1736 1738- 



1755 
1771-1772 



1759-1761 
1777 1779 



013 
1032 
1056 
1109 
1149 
1167 
1188 
1223 
1245 
1266 
1309 
1335 
1369- 
1382 
1423 
1437 
1482 
1516 
1557- 
1608 
1626 
1679 
1714 
1728 
-1739
1765
-1780



fetal brain 



Clontech 



FBRs03 



235-236 520 864 1068 1188 1587 



fetal brain 



Invitrogen 



FBT002 



fetal heart 



fetal kidney 



Invitrogen 



FHR001 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 
562 572-573 590-591 
632 647-648 650 655 
682 690-691 700-701 
746 782 784 788-789 
829 840-841 847 
904 919 
954 



897-900 
948-949 



595 597 
669-670 
710 717 
814-815 
854-855 857-858 
925 935-937 946 



561- 
623 
672 
736 
825 



960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 
1090 1109 1115 1118 
1136-1137 1144-1145 
1157 1193-1195 1198 
1220 1222 1234 1257 

1280 1285-1286
1317-1320 1330 



1274-1275 
1312 1314 
1344-1345 
1358 1364 



1431 1435 
1536 1547 
1582 1587 
1615 1619-1621 
1665-1666 1673 
1715 1723 1728 
1759-1761 1765 
1778 1781-1782 



1349-1350 
1369 1379 
1507 
1564 
1595 
1638 



1476 
1554 
1593 



1082 1085 
1120*1128 
1149 1156- 
1204-1205 
1262 1271 
1294 
1342 
1355-1356 
1383-1384 
1519 1532 



1567 
1601 
1644 



1687-1688 
1749 1753 
1771 1774 
1786 



1578 
1608 
1661 
1690 
1757 
1776 



105 124 180 289 864 1036 
1229 1614 1616 1762 1785 



1148



Clontech 



FKD001 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
Tissue Origin



fetal kidney 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 



1371 1376 
1440-1441 
1618 1631 
1678-1679 



1391 1422 1425-1426 
1470 1543 1598 1601 
1651 1654-1655 1669 
1691-1692 1733 1785 



fetal kidney 



Clontech 
Invitrogen 



FKD002 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



fetal lung 



Clontech 



FLG001 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



fetal lung 



Invitrogen 



FLG003 



83 88-89 
196 224 
291-292 
373 376 



fetal lung 



9 15-16 29 41 47 68-69 
102 124 137 152-153 165 
229 231 249 254 256 267 
300 325 333 344-345 352 
379 384 408 426-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1224 1258 1290 1309 
1347 1355 1369 1381 
1431 1438 1449 1491 
1547 1557-1560 1567 
1636 1644 
1671 1675 



1216 
1342 
1414 
1536 
1601 
1667 
1739 



1653-1655 
1680-1681 



1320 

1413- 

1512 

1590 

1662 

1706 



1760-1761 1769 



Clontech 



FLG004 



103 276 334 
1614 1658 



465-466 737 843 1131 



fetal liver- 
spleen 



Columbia 
University 



FLS001 



3-11 13 15- 
51 54 56-58 
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 186 188 
200 202-206 
233-236 240 
255-256 258 
274 276-278 
293 295 299 
311 314 316 
332 342 344 
358 360 362 
386-387 390 
406 408 410 
437 439-442 
456 459 461 
487-488 490 
506 509-513 
529 531 534 
553-554 561 
576 579 581 



21 25 30 
60-66 6 
85 87 8 
-124 126 
144 147 
167-172 
-190 193 
210-214 
-244 246 
261-265 
280-281 
-301 304 
318 320 
-345 350 
370-374 
392-393 
-412 415 
444-445 
-470 472 
-491 493 
515-520 
536-540 
-562 564 
583 585 



-39 41-4 
8-69 72 
9 92-103 
-127 130 
-149 152 
174 176 
-194 196 
219 221 
-247 250 
268-269 
284-286 
306-307 
-321 326 
352-353 
376 378 
400-401 
417 419 
448 452 
-479 481 
500-501 
522-524 
542 547 
567-568 
-597 599 



8 50- 
75 
105- 
133 
-153 
-178 

198- 
-231 
-251 
272 
288 
309 
329- 
356- 
-384 
403 
422- 
-454 
-483 
503- 
526- 
-549 

571- 
-605 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



fetal liver- 
spleen 



Columbia 
University



FLS002 



607 610-613 615-621 
628-634 636-640 644 
660 665 669-670 672 
681-682 684 690-695 
710 713-714 716-719 
731 734 736 
748 750-751 
777 779 783-788 793 
805 808 810-812 814 
824 826-832 834-837
867 869-876 878-883 
897-898 902 904-914
928 930-937 939 945-950 
960-961 963-965 967 969 



623-624 626 
647-650 655- 
674-675 678 
697 702 708- 
725-728 730- 
738 740-741 743-746 
759-766 768 772 774-
796 798 800- 
818-819 821- 
843-847 849- 
887 889-895 
916 919 921- 
953-958 
971 974- 



978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 

1099-1103 1107- 
1121-1123 1125 
1131-1134 1136-1137 
1153 1159-1160 1163 



1061-1064 
1076 1078 
1089-1090 1097 
1113 1115-1119 
1127-1128 
1144-1150 



1170 1175 1177-1178 
1192 1195-1200 1202 
1211 1214 1216 1218 
1225 1227 1234 1237 



1188 1190- 
1206 1208- 
1221-1222 
1241 1244 
1258 1261 
1277-1282 



1246-1247 1251 1254 
1266 1268 1270-1273 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 
1645-1651 1653-1662
1669 1671 1673-1674 
1690 1696 1701-1703 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 

3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 



1634-1639 
1664 1667- 
1676-1688 
1706-1709 
Tissue Origin


RNA Source 


Hyseq 
Library Name 








SEQ ID NOS:













206 


212- 


215 219-221 


223 


225- 



229 








231- 


-232


240-244 246- 


247 


250- 


251 








258- 


-259


262 264 268- 


269 



272 



275 








277 


280- 


281 284 286 


288 



290- 


292 








295 



298- 


-299 301-304


306 



308- 


310








318 



320- 


-321 323 325


329 


J Jl 



334 








342 


348-


-349 352-353


356 


359 


368








371 


ft A 


376-379 381- 


384 



386- 


-387








392- 


-393


397-398 400- 


401 



403 



410- 








413 


421 


423 426-427 


429- 



-430 



433- 








436 


438 


440 443 445 


448 


451- 


-452








454- 


-455 


460-463 465- 


467 


469 



471- 








473 


475- 


-476 478-479 


481- 


-483 


487 








490- 


-491 


493-494 497 


500- 


-501 


503- 








505 


509- 


-513 515-517 


519- 



-520 



524 








526- 


-531 


534 537-542 


544 


547 



552- 








554 


556 


558 561-562 


564- 


-SSI 


571- 








577 


583- 


-587 590-591 


593 



595 



597 








601 


604- 


-606 608-613 


616- 


-617 


619- 








624 


626- 


-632 634 637- 


-642


644 


647 








649- 


-652 


654-659 662- 


-665



669- 


-672








674- 


-675 


681-682 685


688 


690 


696 








698 


700- 


-703 707 709- 


-710


713 


717 








719-721 


723-724 728 


731 


-732 


734 








737 


-738 


742-745 748 


752 


754 



759 








763 


-766 


768 770 773- 


-777 



780 



782 








784 


786 


791 795-798 


801 


-802 


805 








808 


811- 


-812 818 823- 


-824 


826 


-827








832 


834 


-837 839 843 


846 



848 


-856








858 


-861 


865 867 869 


871 


873 


-874 








876 


878 


881-882 887 


889 


892



894- 


- 






898 


901-


-902 904 906- 


-908 


913 


-915 








919 


921 


-924 926-932 


934 


-935 


937 








939 


-941 


943 946-947 


950 


953 


958 








961 


965 


-967 971 973- 


-975 


977 


-979 








981 


984 


-985 990 992- 


-993 


995 


-997 








999 


1001 1004-1007 1009 


-1011 








1013 1016 


1020 1023 


1025 1027- 








1031 1033-


-1035 1039 


-1042 1044- 








1045 1049 


1053 1055 


-1056 1058- 








1059 1062 


1064-1065 


1067-1070 








1072-1074 


1079 1082 


1087 1089 








1093 1097 


1099-1103 


1105-1107 








1109-1114 


1123 1125 


-1127 1132- 








1134 1140 


1143-1145 


1148-1150








1156 1158 


1160 1163 


1172-1173 








1177-1178 


1181-1184 


1190-1192








1195-1197 


1199 1204 


1206 1208








1211 1214


1216 1219 


1227 1230 








1234-1235


1237 1240 


-1241 1243








1245 1247


1256 1258 


1260-1261








1264 1268 


1270-1271 


1275 1278- 








1279 1284


-1286 1288 


-1289 1299-








1301 1306 


1308 1312 


1314 1317- 








1319 1323 


-1325 1327 


-1330 1334-








1335 1339


1343-1347 


1349-1350








1354-1355 


1357 1360 


1362-1363 








1365-1367


1369 1372 


1376 1378- 








1380 1386


1389-1391 


1394 1400








1403 1406


1409 1416 


-1419 1422- 








1427 1429 


1435 1437 


-1438 1440-








1442 1446


1448-1450 


1453 1460- 








1461 1468


1470 1472 


1474-1475 








1478 1482


1486 1490 


-1493 1496 








1498 1500 


-1504 1506 


1508-1509 








1511-1512


1516 1518 


-1519 1521 








1524-1528 


1531 1536 


-1538 1543








1547 1550 


1554 1556 


1564 1567- 








1569 1580 


1587-1588 


1591-1592 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



1597- 

1618- 

1641 

1661 

1676 

1691 

1713 

1727 

1744 

1763 

1776 



1598 
1628 
1646- 

-1662
1679

-1692

-1714
1730
1748

-1764
1779 



1600-
1630- 
1649 
1664 
1683- 
1699 
1717 
1733 
-1752
1767 
1783 



1601 
-1631
1652 
1667- 
1684 
1702 
1719 
1738 
1758 
1769 
-1786 



1611- 

1635- 

1654- 

-1669

1686- 

1707 

1722 

1740 

1760 

1772 



1612 
1638 
1659 
1674 
1688 
1711 
1726- 
1743 
1761 
-1773 



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen



FLV001 



fetal liver 



fetal liver 



fetal muscle 



Clontech 



Clontech 



Invitrogen 



FLV002 



FLV004 



FMS001 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200
210-211 220 225-226 
239 247 259 
303 310 
356 371 
412 414 



204-206 
235-236 
277 280-281 
321 329 344 
382 395 408 
435 441-442 
506 509 522 
567 569-570 
658 667 669 



267 
317 
376 
429 
494 



725-726 
786 809 
873 875 
916 954 
989 993 
1008 
1070 



732 

817 829 
881 889 
963 967 
995 997 
1014-1015 
1086-1087 



261 
313 
374 
419 

465-466 490 
527 534 552-553 
572-574 607 631 
672 685-686 702 
748 759 761 778 
837 857 861 
894-895 909 
974 977 986 
1000 1005-1006 
1020 1042-1043 
1089-1090 1118- 



230 
272 
320- 
379- 
434- 
504- 
562 
657- 
717 
784 
872- 
911 
988- 



1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 



1426 
1464 
1539 
1583 
1644 
1738 
1779 



1429 1431 
1469-1470 
1549-1550 
1598 1601 
1649 1666 



1442 1448 
1489 1528 
1557-1562 
1611 1615 
1674 1706 



1746 1763-1765 1774 



1425- 
1463- 
1536 
1577 
1622 
1721 
1776 



676 998 1719 



93 133 214 301 355 3 
581 601 679 837 847 
1236 1270 1313 1324- 
1355 1367 1425-1426 
1733 1760-1761 



74 379 555 
859 1123 
1325 1327 
1536 1690 



26 37-39 
113 128 



50-51 58 84 
131-132 139 
201 206 211 
282 286 302 
383 398 412- 
452 462-463 
561 569-570 
626 635 647 
725-726 730 733 761 
826 837 860 874 913 
970 980 986 988-990 
1001 1007 1014 1027 
1045 1060 1064 1070 



194 
261 
376 
436 
519 
607 



198 
276 
379 
448 
529 
623 



86 89 98 
155 172 186 
230-231 256 
325 359 361 
413 419 430 
473 477 503 
590-591 597 
660 672 715 
775-777 788 
915 921 935 
992 1000- 
1035-1036 
1083 1097 
Tissue Origin



fetal muscle 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1099 

1173 

1266 

1324 

1383 

1433 

1557- 

1632 

1712 

1766 



-1102
1198
1270
-1325
1384 
1505 
1559 
1644 
1725-



1116- 

1208 

1277 

1329 

1399- 

1514 

15S2 

1650 

1726 



1117 
1228 
1298 
1336 
1400 
1542 
1589 
1652 
1743- 



1121 
1240 
1317 
1337 
1403 
1551 
1599 
1671 
1744 



1164 
1258 
1320 
1369 
1409 
1554 
1620 
1675 
1754 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 



151-153 156 163 170 176 
189 197-198 200 202-203 
231 246-247 261 263 
285-286 290 293 299 
321 325 328 



222 
277 
311 
341 
362 
388 



180 188- 
210 218 
265-270 
301 307 



345 351-352 
368 370 372 
394 404-405 
419-420 424 426-427 
445 448-449 454 462 
476 490 493 504 506 
519 526 531 537-540 
561 567 572-573 581 
612 615 623 630-631 
651 657-658 660 
672 676 678 681 
709-710 713 717 
728-729 732 748 
766 770 775-777 
789 798 809 811 
824-826 831 842 
864 881 894-895 
918 922-923 928 
946 948-949 953 



330 333-335 339 
355-356 358-359 
376 379-382 384 
408-409 411-412 
436 441-442 
465-466 472 



970 975 977 986 
1000 1004 1007 
1027 1032 
1057-1058 
1072 1077 
1103 1108 
1131 1134 
1153 1156 
1189 1192 
1205 1208 
1220 1222 



1266-1267 
1285 1299 



509 515-517 
547 549 560- 
584 589 611- 
635 647 649 
662-665 667 669 
688 701 704-705 
720-721 725-726 
750 753 759 764 
780-781 786 788- 
814 816-817 822 
857 859 861 863- 
908 910-911 916 
932-933 935 937 
960-961 966-967 
990 992-993 999- 
1013 1018 1025 
1035 1041-1043 
1060 1062-1064 
1090-1091 1097 
1113 1119 1123 
1140 1148-1149 
1163 1167 1178 
1198 
1216 
1243 
1280 



1195-1196 
1211-1212 
1225 1240 
1274 1277 
1310 



1325 1329-1330 



1317-1322 
1342 1344 



1054 
1069 
1099- 
1128 
1152- 
1182 
1201- 
1219- 
1258 
1282- 
1324- 
1346 



1349-1351 
1369 1371 
1383-1384 
1410 1427 
1439-1441 
1468 1470 
1487 
1512 



1354-1357 1365-1366 
1373 1376 1378 1380 
1387 1399-1400 1405 
1429 1431 1433-1435 



1448-1449 
1472 1475 
1490-1491 1493 
1521 1525-1526 



1454 1457 
1480-1481 
1498 1509 



1529 



1536 1547 
1592 1595 
1604 1608 



1549 1557-1559 
1597-1598 1601 
1611 1614 1618 



1535- 
1588 
1603- 
1624- 
Tissue Origin
RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1626 
1644 
1665 
1702 
1724 
1742 
1765 
1786 



1632 
1646 
1668 
-1703
1727 
1747 
1772 



1634 
1654 
1675 
1709 
1731 
1749 
1776 



1636 
1657 
1685 
•1710 
1732 
1755 
•1777 



1641 

1660- 

1687 

1716 

1737 

1760 

1779 



1643- 
1662 
1689 
1719 
•1740 
•1761 
-1780 



fetal skin 



Invitrogen



FSK002 



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 425-427 433 436 450 454
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 



fetal spleen BioChain



FSP001 



110 137 211 353 
1639 1771 



589 927 1108 



umbilical cord BioChain 



FUC001 



fetal brain 



GIBCO 



HFB001 



4-8 10 12 14 
64 68-69 75 
114 116 119 
154 157 161 
184 186 192 
215 230 234 
267 271-272 
314 317 321 
356 368 
392 394 
420 424 
454 459 
486 488 
537-540 
591 593 
645-647 
668 674-675 
703-705 709 
727 732 
777 780 
814-817 
861 864 
900 903 
933 936 
984 990 
1007 1016 
1047 1059 
1077 1089 
1115 1134 
1156 1163 
1208 1216 



17 33-36 44-46 57 
82 85 101 104 113- 
122-124 133 137 153- 
163 166-167 175 181- 
197-198 200-202 212- 
246-247 251 256 263 
280-281 284 295 301 
326 333-335 345 351 
371-373 379-380 386 390 
406 408-410 412 414 416 
430-436 438 444-446 
463 467 473 482-483 
495 504 509 524 526 
555 561 574-577 
615 620-621 632 
659-660 662-664 



427 
461 
490 
547 
606 
650 



687 
714 
762 
793 
843 
888 



1243-1244 
1287 1298 
1350 1357 
1381 1398 



684 
711 
749-750 
789-791 
822 833 
875 879 
906-907 
940 948 
992 998 
1023 

1061-1063 
1094-1097 
1144-1148 
1171 1197 
1218 
1246 
1316 
1359 
1400 



588- 
637 
667- 
701 
725- 
775- 



1424 1427-1428 
1442 1446 1454-1455 
1484-1485 1489 
1505 1513 1525 
1546 1565 1567 
1576 1578-1579 
1601 1608 1612 
1626 1636-1637 
1653 1656 1658 
1675 1682 1684 
1709-1710 1722 
1738 1740-1741 



696 698 
719-720 
765 771 
796 802-803 
845 848 858 
894-895 897- 
911-912 925 930- 
953 960 966 977 
1000-1001 1005- 
1025 1037 1046- 
1073 1076- 
1112-1113 
1151 1154 
1204-1205 
1234-1235 
1283 1286- 
1344 
1373 
1408 
1433 
1479 



1224 
1279 
1320 
1371 
1403 
1431 



1492-1493 
1527 1536 
1571 1573 
1591 1595 
1615 1621 
1647-1648 
1661-1662 
1686-1688 
1727 1729 
1760-1761 



1346 
1375 
1414 
1440- 
1482 
1504- 
1538 
1575- 
1600- 
1624 
1651 
1672 
1690 
1735- 
1768 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66



72 75 77 80 
102 107 110 
123 126 
153-155 
181 186 
208 210 
235-238 
260-262 
286 289 
321-323 
349 352 
371-372 
390 400 
434-435 
455 



325 
354 
377 
408 
438 



82 85 90-91 94 100- 
112-116 118-119 122 
128 134 136-140 147-148 
157 161 165 169-172 175 
188-189 197-198 204-206 
215 222-223 225-226 230 
240-241 247 253 256-258 
267-269 276 279-281 284 
298 300-302 307 310 318 
330-331 339 341 346 
356-359 362 364-365 
379-380 382 384 387 
414-416 419 424 
441-443 449 451 
457-463 470 472-473 475 
478 482-483 486-488 490-491 
496 499-500 502-504 506-507 
512 516 519-520 522 525-526 
530 537-540 543-544 546-547 
567 569-570 572-582 585 588 
591 593 595 599 601 604 
611-612 614-620 622-624 
636 643 645-647 650-652 
665 667-668 670-672 
687 689 692-694 697 
717 721 727 729-732 
743-746 750-751 759 
772 775-777 784 789 
802-805 810-811 814 
826 830 834-837 839- 
858-860 862 864 869 
879 883 885-887 890- 
898-901 905 908-910 
922-923 925 927 930- 
948 952-960 963-964 
975 978-979 981 983 
992 995 997 999-1002 
1011-1013 1016 1018 



431 
453 
477 
493 
509 
529 
566 
590 



661 
681 
714 
738 
770 
799 
824 
856 
877 
895 
919 
938 
972 
990 
1009 



1023 1026 
1038 1041 
1059 1064 
1078-1079 
1094 1097 
1115 1121-1122 
1138 1140 1143 
1156-1157 1159 
1193-1194 1200 



1029-1031 1033 
1047 1050 1053 
1068 1070 1072 
1081-1082 1086 
1103 1107-1109 
1127 1134 
1148-1151 
1167 1170 
1202 1207 
1211 1216 1219-1220 1226 
1229 1232-1234 1240-1241 
1246 1249-1251 1253-1254 
1267-1268 1271 1276 1279 
1285-1289 1293-1294 1305 
1308 1312 1316 1320 1327 
1339 1341-1344 1346 1349 
1357 1359 1365-1366 1369 
1379 1386 1389 
1413-1414 1416 
1425-1427 1430 
1442 1445-1452 
1463-1464 1468 
1474 1477-1479 1489 1492 
1497-1498 1501-1503 1507 
1511-1513 1517 1520-1521 
1526 1531-1533 1535 1537 
1547 1554 1556-1559 1564 
1571 1584 1587 1589 1594 
1601 1611-1612 1614-1616 
1620 1625-1628 1630-1631 
1637-1638 1640-1643 1645 



1373-1375 
1398 1409 
1420-1421 
1437 1439 
1457 1459 



606-609 
630 632 
654 659 
676 678 
699 710 
734 736 
763 766 
791 796 
819-821 
850 854« 
871 876- 
891 893 
912-916 
933 935- 
967 969- 
986-987 

1005- 
-1019 
-1035 
1057 
-1073 
1089 
1113- 
-1135 
1153 
1175 
-1209 
-1227 
1243 
*1258 
1282 
1307- 
1338- 
1355- 
-1370 
1394 
-1417 
1433 
1454- 
1470 
1494 
1509 
1524- 
-1538 
-1567 
1599- 
1619- 
1634 
1648-






















1649 


1651 1653-1655 


1657 


-1658








1664 


-1665 1667 1669


1673


1678- 








1679 


1683-1684 1686 


1693 


1701 








1704 


-1705 1709 1713- 


1714 


1717- 








1720 


1724 1727-1728 


1731 


-1733 








1737 


-1738 1743-1744


1752 


1754- 








1755 


1757 1760-1761 


1765 


1772 








1779 


1785 










macrophage 


Invitrogen 


HMP001 


5-8 


110 


204-205 


503 


534 


678 


859 






878 


933 


988-989 


1379 


1448 1504


infant brain 


Columbia 


IB2002 


10 12-13 


15-18 22-23 


25 


29 34 




University 




37-39 43


47 50-51 54 


-56 


58 60-63 






65-66 68 


-69 72-74 80 


82- 


83 86 








88-92 97 


100 102-104 


106 


-108 


110 








112- 


113 


115-116 


118 


123 


128


130 








134- 


136 


138-139 


143 


147- 


149 


151- 








152 


154- 


155 163 


165- 


167 


169 


172- 








175 


181- 


184 186 


193- 


196 


198 


201 








203- 


205 


209-210 


214- 


215 


222 


224- 








226 


231- 


232 235- 


236


239 


246- 


247 








252 


257 


260 268- 


269


272 


276- 


277 








279- 


281 


286 288 


291- 


292 


295 


298 








300- 


301 


304 307 


310 


313 


321- 


323 








330- 


331 


333-334 


339 


346- 


347 


349 








352 


356- 


357 362 


371- 


372 


377 


379- 








380 


383- 


384 392 


397 


401 


406 


408 








411 


413- 


414 416


418- 


419 


422 


428 








430- 


431 


434-435 


438 


443 


449 


453- 








454 


461 


464-466 


469- 


470 


472- 


473








475-476 


478 482 


-483 


487 


490 


492 








494 


497 


503 507 


-508 


510-513 


516 








519- 


-520 


524-526 


530-534 


536-540 








547 


550- 


-551 561 


563- 


564


566- 


-567 








572- 


-576 


579 581 


-582 


584- 


-587 


590- 








591 


593 


595-597 


607- 


609


611- 


-613 








616- 


-617 


620 622 


-624 


627 


631 


637 








641 


645- 


-647 650 


-655 


657- 


-658 


660- 








665 


667- 


-675 689 


691 


695 


697 


699 








703 


707 


713-715 


717 


721 


728- 


-731 








733- 


-736 


739 743 


745 


751 


755 


759 








763 


769 


-770 772 


778 


780 


-781 


785 








788 


-789 


793-794 


799 


803 


808 


811 








814 


825 


-826 830 


834-836 


840 


-843 








845 


848 


-850 854 


-855 


860 


862 


864- 








865 


870 


872 875 


-876 


878 


886 


888 








890 


-891 


894-896 


898 


903 


-904 


916- 








917 


919 


922-925 


927-928 


930 


-932 








934 


-936 


938 941 


945-946 


948 


-950 








953 


-954 


959-962 


966 


-969 


977 


979 








981 


986 


-990 992 


997 


999 


-1000 








1004-1006 1014 


1016 


1018-1019








1024-1025 1033 


1036 


1047 1051-








1052 1054-1055 


1057 


-1059 1063- 








1064 1068-1070 


1073 


1081-1082








1085 1089 1108-


1113 


1118-1120








1123-1124 1130 


1132 


-1138 1140








1149 1151 1153-


1154 


1163-1170 








1172 1174-1175


1183 


-1184 1188 


1 






1190 1193-1194


1196 


-1197 1199 








1204 1208-1209 


1211 


1218-1222








1226-1227 1229


1231 


1234 1241








1247 1249 1251


1256 


1258 1261- 








1262 1269 1274 


1279 


1281 1283 








1285 1287-1289 


1294 


-1295 1305 








1307 1313-1314 


1316 


-1320 1329








1332 1341-1342 


1345 


1349 1356








1362-1363 1365- 


1366 


1368-1370








1374 1381 1383- 


1384 


1388 1400








1403 1406-1407 


1413 


1417 1420



infant brain 



Columbia 
University 



IB2003 



infant brain 



Columbia 
University 



infant brain 



IBM002 



1423 

1441 

1454- 

1468 

1483 

1499 

1522- 

1542 

1555 

1580 

1593 

1610 

1624 

1639- 

1654- 

1672- 

1693- 

1717- 

1733 

1755- 

1777- 



1429 

1443 

1455 

1470- 

1485 

1502- 

1523 

1546- 

1563 

1583-

1595 

1612 

1626- 

1640 

1655 

1673 

1695 

1720 

1735- 

1758 

1778 



•1431 
1447
1457 
1471 
1493 
1503 
1525 
1547 
1565- 
1586 
1598 
1614- 
1627 
1642 
1658- 
1675- 
1701- 
1723- 
1741 
1762 
1786 



1435 
-1449 
1459 
1475 
•1494 
1505- 
1528 
1549- 
1557 
1588 
1600- 
•1616 
1630- 
1644 
1659 
1681 
1702 
1724 
1743- 
1765 



-1436 
1451 
1463 
1479 
1496 
1507 
1531 
1550 
1569 
1590 
1601 
1619 
1633 
1647 
1664
1685- 
1704 
1726- 
1744 
1771 



1439- 
-1452 
-1465 
1482- 
1498- 
1509 
-1533 
1554- 
1575 
1592- 
1608- 
1621 
1637 
1652 
1665 
•1688 
1708 
•1728 
1752 
1774 



17-18 20-23 29 34 4 
78-80 88 100-101 10 
123 128 133 135-137 
159 166 169 
223 225-226 
276-281 286 
324 
352 



310 322 
349-350 
384 403 



174 194 
229 235 
290-292 
331 334 
357 371 
408-409 414 
472 476 478-479 490 
520 530 534 536-540 
576 585 587 590-591 
601 606 612 616-617 
650 652-653 661 665 
675 678 689 715 717 
734 759 775-777 780 
806-807 811 824 845 
875 882 889 894-895 
919 921-923 932 935 
954 962 977 979 997 
1005-1006 1009 1011 



1033 1037 
1114-1115 
1145 1149 
1170 1174 
1202 1206 
1229 1240-1241 
1288-1289 1305 
1344 1347 1350 
1366 1378-1379 
1421 1423 1431 
1446-1447 
1503 1507 
1559 1567 
1610-1612 
1647 1657-1658 
1683-1684 
1713-1714 
1765 1771 



1043 1055 
1120 1123 
1151-1153 
1193-1194 
1209 1220 
1251 
1314 
1356 
1388 
1436 
1459 
1536 
1587 
1631 
1673 



1457 
1509 
1572 
1615 



1701-1702 
1719 1757 
1778 



3 60 68-69 
7 110 112 118 
146 148 152 
198 203 215 
-236 247 260 
295- 300-301 
339 346-347 
376-377 382 
-415 453-455 
503 507 516 
551 563 572- 
593 595-596 
620 622-624 
670-671 674- 
727-728 730 
-781 785 796 
-846 864 869 
898 904 917 
-936 946 950 
999-1000 
1017 1024 
1057 1109 
1127 1144- 
1160 1167 
1196 1199 
-1221 1226 
1258 1284 
1327 1333 
-1357 1365- 
1400 1403 
1440-1441
1471 1499 
1546 1557- 
1595 1598 
1639 1644 
1678-1681 
1708-1709 
1760-1761 



101 113 139 152 260 279 290-292 
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



Columbia 
University 



IBS001 



10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669 






WO 01/53312 



PCT/US00/34263 



























671-1 


672 


819 


949 


966 


1113 


1130








1151 


1188 1193-1194


1196 


1229








1258 


1265 1271 1287 


1317 


-1319








1324 


-1325 1342 1423 


1440 


-1441








1448 


1471 1482 1525


1532 


1546 








1562 


1569 1588 1591


1610 


1618 








1647 


1649 1658










lung, 


Stratagene


LFB001 


5-9 


17 20-21 


25 


68-69 82 


94 


105 


fibroblast 






153 


157 


197- 


198 


203 


207- 


208 


212- 








213 


223 


262 


266 


283 


302 


321 


326 








333 


356 


370 


427 


430 


436 


446 


462 








472 


493 


498 


503 


516 


519 


527 


535 








537- 


540 


542- 


544 


562 


565 


567 


586 








599- 


600 


607 


615 


630 


647 


662- 


664 








692- 


694 


712 


719 


745 


748 


775- 


777 








794- 


796 


810 


837 


843- 


847 


849 


854- 








856 


869 


876 


903 


934 


953 


955- 


956 








964 


975- 


976 


984 


1000 1005-1007 








1024 


-1025 1033 1039 


1053 


1064 








1070 


1072 1082 1112-


1113


1134








1136 


-1138 1140 1195 


1223 


1232- 








1233 


1246 1279 1285 


1295 


1311 








1320 


1334-1335 1343


1427 


-1428 








1446 


1478 1482 1493 


1504 


152 


17 








1552 


1555 1567 1575 


1582 


1598 








1620 


1625 1632 1638 


1645 


1654- 








1655 


1662 1680-1681 


1684 


1686 








1690 


1696 1702 1711 


1733 


1741 








1760 


-1761 1778 1785 








lung tumor 


Invitrogen 


LGT002 


5-10 


18 


20-21 29 33- 


-36 40 43 52 






54-55 61 65-


•66 68-70 73- 


75 80 85 








88-89 93-94 


100 


103 


106- 


108 


112- 








113 


115- 


-116 


118- 


-119 


123- 


124 


126 








130- 


132 


135- 


-137 


139- 


-141 


143- 


-144 








147- 


148 


151- 


-153 


155-156 


159 


161 








164 


169 


171 


179- 


-180 


185 


190 


192 


• 






194 


196- 


-199 


203-208 


210 


212-214 








216- 


217 


219 


222 


233 


240- 


241 


244 








246 


251- 


-252 


255 


-256 


261- 


•262 


266








272 


276- 


-277 


279-281 


284 


286 


288 








290 


295 


298 


301- 


-302 


309- 


312 


317 








321 


329 


332 


341 


-342 


344-345 


348 








352 


358-360 


363 


368 


370- 


371 


376 








380- 


381 


384 


389 


-390 


398 


400 


409 








414 


423 


426- 


-427 


430 


432- 


436 


443- 








444 


450 


-451 


454 


462 


468 


472- 


-477








480- 


483 


487 


-488 


490 


-491 


493 


496- 








498 


500 


503- 


-506 


509 


-512 


515 


-516 








519 


521 


-523 


526 


530 


534 


541 


544 








547 


554 


557 


564 


566 


-567 


572 


-576 








585- 


•586 


588 


-589 


595 


-596 


601 


607 








611- 


•612 


615 


619 


621 


623 


626 


630 








632- 


•633 


644 


647 


649 


651 


655 


-656 








660 


662 


-665 


667 


669 


672 


683 


-684 








696 


700 


706 


710 


713 


716 


718 


-719 








722- 


•723 


728 


734 


-739 


743 


750 


752 








763 


765 


-766 


773 


-778 


784- 


•785 


787- 








789 


791 


800 


802 


-803 


809- 


•812 


814 








824 


826 


828 


-829 


832 


838- 


■839 


841- 








845 


849 


-850 


852 


-855 


857- 


-861 


864 








866 


874 


878 


-880 


882 


887 


890 


-891 








897- 


•898 


902 


904 


906 


-907 


910 


916 








918-920 


922 


924 


-925 


927 


930 


-932 








934- 


-935 


937 


947 


950 


953 


955 


-956 








961 


963 


966 


-967 


969 


971 


977 


-979 








981 


984 


986 


-987 


990 


992- 


-993 


995 








997 


999 


-1001 1005-1007 1009 










1012-1013 1018 


1020 


1022-1024 








1026 1029-1030


1033 


1038 1041










1045 


1047 


-1050 


1052 


1054 


-1055 








1059 


1063 


-1064 


1067 


-1071 


1073- 








1074 


1078 


1085 


1087 


1089 


1095- 








1097 


1104 


1106- 


■1107 


1109 


1112 








1116 


-1117 



1119 


1126 


1134 


-1135 








1139 


1141 


-1142 


1144 


-1145 


1148 






- 


1152 


-1153 


1156- 


•1158 


1167 


1170 








1172 


1178 


1195-1196 


1198 


-1200 








1202 


1204 


1208 


1214 


1216 


1219 








1222 


1227 


1234 


1241 


1247 


1252 








1257 


-1258 


1265 


1267 


-1270 


1276 








1278 


1280 


-1281 


1283 


1285 


1288- 








1289 


1295 


1300 


1305 


1308 


1312 








1317 


-1321 


1329 


1338 


-1339 


1341 








1344 


-1346 


1349- 


1351 


1353- 


-1355 








1357 


1365 


-1366 


1369 


1378 


-1379 








1383 


-1385 


1394 


1397 


1400 


1402- 








1403 


1408 


1417 


1419 


1423 


-1426 








1431 


1433- 


-1436 


1438 


1444 


1446- 








1448 


1454-


-1455 


1460 


1466 


1468 








1470 


1474 


1480- 


1481 


1483 


1486- 








1488 


1490 


-1491 


1494 


-1496 


1506 








1508 


-1509 


1511- 


1512 


1515 


-1516 








1519 


1523- 


-1524 


1528- 


-1529 


1536- 








1540 


1546 


1549- 


1550 


1555 


1560- 








1561 


1565 


1567 


1569 


1575 


1588 








1591 


1593-1594 


1596-1598 


1600- 








1602 


1608 


1614- 


1616 


1618 


1620 








1624- 


-1625 


1627- 


1632 


1636 


1639 








1644-1645 


1647- 


1649 


1652 


-1653 








1656 


-1662 


1664 


1666 


-1667 


1670- 








1671 


1673-1675 


1678-1679 


1683 








1685- 


-1688 


1690- 


1692 


1696-1699 








1705


1709 


1716- 


1717 


1722 


1727 








1730 


1735 


1739 


1741 


1743- 


-1744 








1748- 


-1749 


1753 


1760-1762 


1765 








1767 


1770- 


-1771 


1773 


1775- 


-1776 








1778- 


-1779 


1786 








lymphocytes 


ATCC 


LPC001 


4 11- 


-12 18 24-25 30-


-31 48 50-51 








56-57 68-69 80 


92 98 103 


105 110 








126 137 152-153 


157 


165 172 188- 








189 197 203 210 


217- 


■218 222-223 








225-226 229 231 


247 


251 256 264 








272 280-281 284 


300- 


•301 321 325- 








326 339 348 352 


357 


371 382 384 


- 






390 400 404 412 


414 


421 423 426- 








427 430-431 445 


447-448 451 454- 








455 475 503 516 


526-527 530 537- 








540 549 556-560 


563 


574 577 589 








602 613 615-617 


621 


623 628-630 








636-637 647 649 


657-659 690 697 








717 723 755 764 


775- 


•777 780 786 








789-790 793 800 


802 


822 838 849 








866 869 876 881 


-883 


892 898 906- 








907 911 921-923 


928 


975 990 992 








996 1001 1004-1007 1033 1050 








1054 


1078 


1107 


1135 


1140- 


1141 








1143 


1148 


1158 


1163 


1177 


1199 








1205 


1216 


1226 


1231 


1236 


1241 








1244 


1250 


1258 


1260 


1265 


1269- 








1271 


1290- 


1293 


1308 


1312 


1317 








1319- 


1320 


1339 


1345- 


1346 


1348 








1350- 


1351 


1357 


1367 


1369 


1379 








1381 


1383- 


1384 


1386- 


1387 


1389 








1394 


1397 


1405 


1423 


1425- 


1428 








1431 


1437 


1446 


1448 


1461 


1466 








1470 


1472 


1474 


1482


1492 


1506 








1528 


1537


1546 


1549 


1591 


1598 








1600 


1603- 


1604 


1606 


1627 


1636



leukocyte 









GIBCO 



LUC001 






1638 1647-1649 
1664 1676-1677 
1688 1699 1711
1728 1737 1740 



1651 1658-1659 
1680-1681 1687- 
1715-1716 1726 
1746 1748 1752 



1756 1758 1777 1779 



212-215 
236 247 
274-277 
307-310 



3-4 10-11 13 15-18 20-21 24-25 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
217-219 222-223 229 235- 
251 255-258 260 262 272 
280-281 285-286 297-301 
313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
453-454 456 459 461-
474-479 481 483-485 
499-501 503-504 509- 
522 526-527 529-531 
542 547-549 553-559 
574-577 579 582 584- 
595-597 601-602 604 
611-613 615-621 623 627- 
636-637 642 644-650 655 
662-665 667 669 674-675 
678 682-684 692-696 698 
708 710 716-720 725-726 



442 445-451 
464 468-472 
487-491 496 
513 516-519 
534 536-540 
566-567 571 
586 589 
606-607 
629 633 
659-660 



593 



749 751 
770-778 
796 798 
817 819 
838 843 
877-879 
904-914 
935-936 
955-956 



738-739 743-746 
759 765-766 768 
786 788-790 793 
803 810-811 814 
830 832 834-836 
863-864 866-871 
894-896 898 902 
925 927 930-932 
945 948-949 953 
962 964 967 970-971 973 
985-990 992-993 995-996 
1004-1009 1011 1014 
1022-1023 1025 1027 
1033-1036 1038 
1050 1053-1054 
1062 1064 1068 

1089-1091 
1110-1113 
1125 1129 
1140-1145 
1170-1174 
1182-1183 1186 
1202 1205-1206 



1085-1086 
1106-1107 
1122-1123 
1135-1137 
1163 1168 
1180 
1200 



700 706 
729-736 
753 756 
780 784- 
800 802- 
826 828- 
845-860 
881-892 
916 919- 
941-942 
958 960- 
975 977 
999-1002 
1017-1019 
1029-1031 
1041 1043 1047 
1058-1059 1061- 
1070 1072 1078 
1093 1097 
1115-1117- 
1132-1133 
1152 1158 
1176-1178 
1195 1198- 
1211 1216 
1230-1236 
1254 1256 



1219-1221 1223-1227 
1238-1242 1247 1252 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450



1453 

1470- 

1488 

1506 

1521 

1531 

1549 

1565 

1594 

1608 

1626- 

1639 

1653 

1670 

1692 

1711 

1727 

1744 

1762 

1784 



1458- 

1471 

1490- 

1509 

1522 

1534 

1550 

1567 

1596 

1611 

1629 

1641 

1655 

1675- 

1696 

1716 

1733 

1748 

1765 

1786 



1459 

1474 

1493 

1512- 

1524- 

1538 

1553 

1575 

1598 

1614 

1633-

1644 

1658-

1679 

1700 

1717 

1737- 

1749 

1769 



1463- 

1477- 

1496- 

1513 

1525 

1541 

1555- 

1580 

1600 

1620- 

1632 

1645 

1660 

1684 

1702 

1720 

1738 

1752 

1771- 



1464 
1478 
1501 
1516 
1527- 
1545- 
1556 
1589 
1602 
1621 
1636 
1648 
1662 
1688 
1707- 
1723 
1741 
1755 
•1772 



1468 

1482- 

1504 

1519 

1528 

1547 

1560 

1591 

1606- 

1624 

1638- 

1650 

1659- 

1690. 

1709 

1725 

1743- 

1760- 

1781- 



69 75 82 102 
244 280-281 
455 461 476- 
554 575-576 
622 624 630 
679 698 764 
851 856-857 
952 990 992 
1183 1216 
1346 1353 
1515 1534 
1614 1621 
1691-1692 



leukocyte 



Clontech 



LUC003 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004 



mammary gland 



Invitrogen 



MMG001 



372 
481 
589 
647 



4 35-36 44-45 61 68
119 139 154 179 197 
404 430-431 
503 537-540 
608-609 621 
662-664 669 
775-777 802 848
905-907 915 949 
1113 1119 1170 
1241 1275 
1377 1506 
1600 1613 
1676-1677 
1738 1772 



324 
477 
581 
632 
773 
879 
1002 
1236-1237 
1357 1359 
1553 1591 
1628 1670 
1699 1733 



25 35-36 
163 166 
271 277 
345 351 
415-416 
481 490 



43 80 
188-189 
280-281 
372 380 
430 445 
499 503 
575-576 588 
665 734-735 
800 832 845 
887 905 914 
990 992 999 
1050 1055 



104 126 128 150 
197 210 215 220 



567 
660 
790 
883 
985 
1038 
1099-1102 1107 
1156 1163 1172 
1214-1215 1217 
1238-1239 1244 
1293 1311 1320 
1345 1355 1367 
1403 1406 1414 
1465 1521 1529 
1547-1548 1582 
1638 1647 1653 
1670 1680-1681 
1724-1725 1731- 
1761 



317 
383 
454 
526 
613 
759 
859 
934 



336-338 
387 412 



456 
546 
615 
778 
869 
958 



467 
548 
647 
787 
878 
976 



310 
-381 
448 
526 
601 
737 
856 
932 

-1000 1025 1031 
1068 1074 1088 
1136-1138 
1190 1195 
1226-1227 
1253 1278 
1330 1334-1335 
1386-1387 1394 
1423 1437 
1536 1539 
1620 1626 
1660 1667 
1696 1704 
1732 1750 



1149 
1200 
1235 
1280 



1442 

1541 

1631 

1669- 

1715 

1760- 



5-8 10 12 
33-39 42- 
71 73-74 
106 108 1 
146 148 1 
166 170-1 
188-190 1 
222 224 2 
251 253-2 
271 276-2 



14-18 
43 52 
79-80 
12 123 
50-152 
72 174 
94-198 
27-228 
54 256 
77 279 



20-21 24 
55-58 60- 
82 89 98 
128 133- 
154 158- 
176 178 
201-206 
231 233- 
261-263 
-281 284- 



-25 29 
64 68-69 
100 103 
137 144- 
159 165- 
181-185 
210 217- 
237 247 
266-267 
286 288




















290 297 299 301 


304 309-312 318 








320-321 323-325


327-329 331-332 








334 339 341 344- 


•345 348 350 356 








359-360 362-363 


368


371 376 379- 








383 388 390 393- 


-395 


397-398 405 








408 412 414-415 


423 


430 434-437 








441-444 441 


3 451- 


-455 


462-464 474 








476 479 482 485- 


-486 


488 490 494- 








495 4 


38 50 


3 506 


509- 


512 516-517 








519-520 522 527 


529 


534 537-541 








547 549 554 557


562 


572-574 587 








589-591 597 602 


607 


618 623 628- 








629 632 634-640


644 


647-648 650- 








652 655 657-658 


660 


665 667 669-








672 674-676 679 


682 


688 695-696 








706-707 710 713 


717 


720 722-730 








732-734 736 738


743 


747-748 750 1 








755 759 761 766


770 


780 784 786-








789 794 803 806


-807 


809 814 817- 








822 827-829 837


842 


854-858 863- 








•864 866 869-870 


872 


878 881 889








893-900 904 906 


-907 


911 916 919 








921-923 926 935 


-937 


946 948-949 








953-954 957 960 


-961 


963 965-966 








970 977-978 984


-989 


993-997








1000- 


1001 


1005- 


1006 


1008 


1013- 








1014 


1016- 


1017 


1023 


1025 


1027 








1032- 


1033 


1036 


1039 


1043 


1045 








1055 


1057- 


1058 


1063 


1068- 


1075 








1077- 


1078 


1085 


1087 


1089- 


1091 








1095- 


1102 


1107- 


1108 


1112- 


1119 








1121- 


1123 


1131- 


1133 


1136- 


1137 








1139- 


1142 


1144- 


1145 


1148- 


1149 








1153 


1159 


1167 


1170 


1172- 


1173 








1183- 


1185 


1190- 


1192 


1196- 


1199 








1207- 


1208 


1212 


1216-1218 


1222- 








1223 


1225 


1231 


1234 


1240- 


-1 OA T 








1247 


1253-1254 


1258- 


-1259 


1261- 








1262 


1270-1280 


1283 


1285- 


-1286 








1298 


1307 


1314 


1316- 


-1320 


1323- 








1325 


1330 


1334- 


1335 


1342- 


-1345 








1349- 


-1352 


1354- 


•1355 


1359 


1369- 






- 


1370 


1377 


1379 


1381 


1383- 


-1384 








1389 


1405 


1414 


1419 


1421- 


•1423 








1425- 


•1426 


1428- 


•1429 


1431 


1434- 








1437 


1439 


1448- 


-1449 


1454 


1457 








1460- 


•1464 


1466 


1471 


1480- 


-1483 








1487 


1489- 


-1491 


1493 


1505 


1507 








1512 


1519 


1526- 


-1528 


1532 


1534 








1536 


1539 


1542 


1547 


1549 


-1550 






1 


1554 


1561- 


-1562 


1564 


1567 


1572 








1576- 


-1579 


1581-1582


1587 


-1588 








1592 


1594 


1596- 


-1597 


1601 


-1602 








1607- 


-1608 


1610 


1612 


-1616 


1618 








1621 


-1622 


1625- 


-1626 


1631 


1635- 








1636 


1641 


1643 


-1644 


1647 


1650 








1652 


1654 


-1655 


1657 


-1658 


1660 








1662 


1664 


-1666 


1669 


-1671 


1673-








1674 


1676 


-1677 


1680 


-1685 


1689- 








1692 


1701 


1706 


1713 


-1715 


1719- 








1720 


1723 


-1728 


1730 


-1732 


1738 








1740 


1742 


-1744 


1746 


-1747 


1749 








1751 


1753 


1760 


-1762 


1765 


-1768 








1771 


1774 


1776 


-1777 


1779 


1783- 








1784 


1786 










induced neuron 


Stratagene


NTD001


29 35-36


80 116 123 


156 


163 181 


cells 






214 


230 280-281 284


-285 


307 321 






330 


340 358 371 375


377 


380 382 








422 


424 492 497 532


-533 


542 546



retinoic acid

induced 
neuronal cells 
neuronal cells 



549 566 586 595 612 
734 775-778 780 792 
856 858 875 936 953 
1041-1043 1055 1072 
1194 1206 1223 1246 
1288-1289 1291 1294 
1349 1359 1412 1423 
1623 1645 1684 1705



645-647 654 
799 821 826 
985 990 992 
1104 1193- 
1253 1274 
1311 1320 
1485 1620 
1715 1751 



Stratagene



NTR001 



5-8 78 268-269 277 383 431 506 
623 677 731 999-1000 1199 1425- 
1426 1547 



Stratagene



NTU001 



pituitary 
gland 



placenta 



Clontech 



PIT004 



Clontech 



PLA003 



29 
166 
284 
391 
470 
540 
661 
904 
1025 

1219 1225 
1295-1296 
1330 1350 
1383-1384 
1539 1547 
1690 1738 



65-66 80 82 110 119 146 152 
174 181-185 198 227-228 253 
309 325 332 334 336-338 375 
393 406 414-416 454 465-466 
488 503 506 510-512 519 537- 
572-574 597 602 607 623 647 
700 702 716 743 771 792 858 
948 954 977 1000 1005-1006 
1064 1068 1122 1148 1185 
1234 1246 1271 1283 
1311 1317-1320 1329- 
1355 1365-1366 1378 
1400 1412 1445 1505 
1578 1647 1656 1683 
1749 1783-1784 



311 314 379 408 419 
1095-1096 1272-1273 
1378 1652 1671 1720 
1741 1755 



430 454 1055 
1312 1320 
1725 1736 



5-8 124 208 277 370 
1280 1317-1319 1369 
1737 



843 906-907 
1609 1621 



prostate 



Clontech 



PRT001 



rectum 



Invitrogen



REC001 



9 46 57 71 107 147 171 177 197 
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 

1232-1233 1241 
1287 1295 1313 
1349 1360 1362- 
1367 1437 1442 1447 1475 
1482 1489 1513 1517 
1536 1598-1599 1628 



1196 
1258 
1333 
1363 



1198 1202 
1272-1273 
1341 1344 



1478-1479 
1527 1531 
1636 1657 
1717 1738 



1680-1681 
1743-1744 



1687-1688



17-18 29 33 
113 126 146 
200 206 261 
373 388 395 
442 446 448 
540 547 567 
629 632 
717-719 721 
756 762-763 
825 843 849 
949 960 986 



309 
408 
464
585 
645-647 



62-63 71 73-74 83 86 
153 158 167-169 195 
341 344 
420 430 
483 517 
602 623 
657-658 
738 748 
774 790 
903 909 



312 
414 
468 
589 
651 



725-726 
766 770 
851 881 
996 1020 



368 

441- 

537- 

628- 

669 

750 

819 

948- 



1034 1064 1067 
1108-1109 1113 
1159 1172 1178 
1205 1220 1225 
1317-1320 1323 
1351 1355 1369 



1023 1033- 
1075 1086 
1139 1153 
1187-1189 
1244 1271 



1070 
1130 
1185 
1240 
1334-1335 1350- 
1373 1375 1425-










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596 








1610 1622 1627 1644 1658 1662 








1665-1666 1669 1675-1677 1749 








1786 


salivary gland




SAL001 


10 55 97 103 110 140 149 152 158 








198 217-218 242-243 256 301 308 








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511








1523-1524 1537 1554 1596 1626- 








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692 


salivary gland 


Clontech 


SALs03 


158 326 1423 1463-1464 


skin 


ATCC 


SFB001 


1320 1400 


fibroblast 










ATCC 


SFB002 

W * abV W >/ ab« 


262 736 1025 1253 


Am mLAmfJm dW W 










ATCC 


SFB003


709 1119 1350 1631 1653 


J. 1JJI UU1SO L. 














?^ 147 146-147 151 155 198 203 








244 260 271 280-281 286 288 298 








301-302 308 312 334 340 371 398








408 412 414 416 423 426-427 430 








434-435 445 452 454 478 503 516








ciq col 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904 








906-907 917 919 935 997-998 1000








1007-1008 1026-1028 1044 1055








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597 









1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 




Clontech


SKM001 


18 20-21 82 84 101 118 134 148 


mil rar^T 






151 153 166 225-226 258 274 277









289 329 361 412 414 424 440 452 








459 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 830 








905 922 950 963 982 990 992 1020 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMS03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


235-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 9 11 17 30-31 35-36 43 46 60 
adult spleen 



Clontech 



82 85 92 
167 198 



94 108 
204-205 
280-281 
379 387 



110 
210 
300- 
392 
473
526 
567 
625 



SPLc01



259 277 
317 372 

430 433 448 467 
509 513 519 524 
547 549 551 559 
607 616-617 623 
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777 
809 820 832 834-836 
855 858 861 864 871- 
898 906-908 917 919 
944 970 985 990 992- 
1039 1053 1059 1065 
1077 1082 1085 1097 
1116-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350 
1356 1359 1368 1375 
1407 1423 1429 1437 
1454 1470 1482 1492 
1511 1529 1538 1548- 
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



116 139 157 
215 229 256 
302 304 315
419 426-427 
487 489 506 
537-540 543 
569-570 593 
637 649-650 
673 679 681- 
728-729 734 
781 789 791 
847-849 854- 
872 875 884
924 934 942 
993 998 1013 
1072 1075 
1103 1109 
1151 1170 
1225 1241 
1312 1320 
1353-1354 
1400 1406- 



1443 
1501 
1549 
1614 



1448 
1508 
1565 
1625 



1651-1652 
1751 1755 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



stomach 



Clontech 



STO001 



10 15-16 
197 201 
281 287 
426-427 
597 620 
780 782 
967 976 



1071 
1259 
1359 
1487 
1634 



61 68-69 100 117 149 
227-228 231 249 273 280- 
291-292 302 312 358 362 
430 446 462 475 479 535 
630 651 662-664 722 739 
785 846 919 960 964 966- 
1008 1012 1032 1042 1063 
1135 1170 1208 1234-1235 
1277 1280-1281 1322 1349 
1369 1449 1468 1474 1478 
1493 1498 1557-1559 1622 
1651 1653 1729 



thalamus 



Clontech 



THA002 



thymus 



Clontech 



THM001 



9 11 25 
190 198 
239 261 
333-334 
388 393 
477 483 
608-609 
774 782 
899-900 
1034 1055 
1150-1151 
1193-1194 
1305 1345 
1440-1441 
1562 1572 
1614 1640 
1688 1703 
1753 



85 87 112 137 146 180 
206 210 212-213 235-236 
268-269 279 290 301 325 
341 351 356 364-365 379 
396 419-420 441-442 458 
508 525 531 549 567 606 
647 681 715 725-727 736 
784 794 827 883 890-891 
961 997 999-1001 1004 
1097 1129 1144-1145 
1157 1172-1173 1177 
1208 1220 1249 1280 



1355 1369 
1454 1496 
1578 1590 
1651-1652 
1743-1744 



1434-1435 
1546 1549 
1594 1613- 
1671 1687- 
1746-1747 



44-45 54 57-58 62-64 79 104 123 
126 134 153 193 212-213 218 242- 
243 258 274 277 279 297 301 307 
327 330 333 342 351 358 371 410 
430 445 465-466 468 471 483 487 
493 503 506 509 517 526 535 537- 
540 546 
591 604 
649 656 
728 735 
775-777 
824 826 



866 
900 927 
992 999 



548 
612 
660 
739 
780 
828 
870-871 



554 
621 
665 
746 759 
784-785 
845 851 
878 884



567 584 
638-640 
670 698 



930-931 967 



762 
800 
858-859 
887 892 
983 986 



586 590- 
645-647 
710 720 
766-767 
802 809 



1014 1029-1030 1033 



864 
899- 
990 
1059 



1066 


1073 


1103 


1107 


1113 


1116- 


1117 


1119 


1140- 


-1142 


1158 


1163 


1172 


1177 


1195 


1206 


1209 


1213 


1216 


1218- 


1219 


1221 


-1222 


1227 


1271 


1277 


1282 


1320 


1329 


1349 


1367 


1369 


1383- 


-1384 


1417 


1419 


1423 


1425-1427 


1448 


1477 


1488 


1493 


1536 


1554 


1620 


1644 


1646 


1649 


1654-1655 


1661 


-1662 


1669- 


1670 


1674 


1676 


-1677 


1685-1688 


1707 


1711 


1731 


-1732 


1737 




5-9 


15-21 


25 33 35- 


36 43 


-45 48 



thymus 



Clontech 



THMC02 



50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 



211 217-219 222 
236 240-241 244 
262 268-269 286 
301-302 309-310 
327 334 342 350 
373 382 384 400 
424 430-431 436 
464-467 470 472 
497 500 504 506 
524 526 530-531 
554-555 565-566 
575-577 586-587 
612 630-632 634 
660 666-667 669 
700 703 708 720 
739 743-744 750-753 
765 767 772-779 787 



224 229 233 235- 
251-252 256 261- 
288 290 295 297 
315-317 321 324 
352-353 360 370- 
403 410 414-416 
445 454-456 461 
474-476 483 488 
513 516 519-520 
534 537-540 549 
569-570 572-573 
595 603-604 606 
636 647 650 
673-675 678 
725-726 731 
757 759 
789-790 



800 810 823 
854-856 859 
890-891 898 
941 949 958 
981 986 988-990 992 
1008 1014 1016 1039



657- 
698 
738- 
763- 
798 
848 
881 
933 
975 



1074 1079 1089 
1117 1122 1131 
1145 1163 1172 
1196 1198 1206 
1223 1227 1234-1243 
1267 1271 1280-1281 
1308 1317-1320 1322 
1327 1330 1334-1335 
1350-1351 1355 1357 
1374 1377-1379 1386 
1397 1400 1402 



829 834-836 841 
861 864 870-871 
908-909 913 928 
961 963 967 969 

999 1007- 
1041 1073- 
1097 1109 
1140-1141 
1175-1177 
1211 1216 



1392 

1417 1423 1425-1427 
1466 1474 1477 1483 
1504 1506 1525 1536 
1566 1594 1598-1600 
1614 1621 1623 1625 
1641 1644 1647 1649 
1658 1662-1663 1671 
1681 1686-1688 1693 1705 
1711 1717-1718 1726-1727 
1733 1737-1738 1743-1745 
1761 1771-1772 1779 1786 



1114- 
1144- 
1186 
1220 
1261-1262 
1284 1290 
1324-1325 
1339 1346 
1360 1370 
1389-1390 
1406-1407 
1440-1441 
1493 1498 
1545 1549 
1608 1611 
1632 1639 
1653-1656 
1673 1678- 



1707 

1731- 

1758- 
thyroid gland 



Clontech 



THR001



trachea 



Clontech 



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152 
153 155-158 163-164 168-169 171 
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 
262 265-266 268-269 277 
288-289 298-299 



284-286 

311 317 321 326 332 335 
344 348 350 354 358-359 
371-373 382-383 385 394 
401 411 414-415 421 424 
433-436 443-446 450-452 
458 472-474 476-478 482 
487-488 490-494 496-497 
503-504 506 509-513 516-517 
524 526-527 529 535-540 547 
564 569-570 575-576 588 
601-602 604 606 610 612 
628-630 634-635 
660 662-665 668 
696 
734 
761 



256 258 
280-281 
302 309- 
341-342 
363 368 
398 400- 
430-431 
454-455 
484-485 
500-501 
519 



562 
595 
617 
647 
681 

727-729 
745 750 
780 
824 
849 
881 
908 
927 
948 
979 



619-623 
649-651 
690-694 
732 
759 
785 795-796 
826 828 833 



698 
738 
763 
798 
838 



700 709 
740-741 
765 770 
802 804 
841-845 



549 
594- 
615- 
642 
670 
721 
743 
773 
823- 
847 
880- 
898 
926- 



857-860 867 874-875 878 
887-888 890-892 894-895 
910-911 913-914 922-923
929 932-934 937 939 941-942 
953 957 961 963-964 966 978- 
981-982 987 990 992 1001 
1004-1006 1010 1014 
1033 1038-1039 1044 
1052-1054 1056 1058 
1071 1077-1079 1088 
1105-1106 1112-1113 
1128-1129 
1142-1143 
1156 1161-1164 
1177-1181 1190 



TRC001 



1124 1126 
1136-1137 
1149-1150 
1170-1173 
1197 1200 
1217 1219 
1235 1241 
1258 1260 
1286-1289 
1330-1332 
1349 
1381 



1020 1024 
1047 1050 
1068 1070- 
1094-1097 
1116-1117 
1131 1134 
1146-1147 
1167 
1192 



1365-1367 
1394 1407 1419 
1437 1440-1441 1443 
1454 1459 1461-1462 
1471 1475 1477 1479 
1497-1498 1504-1505 
1522 1524-1526 1528 
1536-1537 1548 
1559 1562 1567 
1597 1599-1601 
1619-1620 1622 
1631-1632 1634 
1645 1648 1651 
1660 1662-1663 
1675 1678-1681 
1691-1692 1703 
1724-1726 1729 
1740 1743-1744 
1761 1770 1777 



1204 1208-1209 1214 
1222 1230 1232-1233 
1245 1247 1254 1257- 
1262 1271-1273 1283 
1299 1306 1314 1320 
1334-1335 1342 1345 
1370-1372 1374 



1550 
1578 
1612 

1624-1626 
1636 1639 
1653-1656 
1667 1669 
1683-1686 
1709-1711 
1734 
1749 
1786 



1428 1436-
1446-1449 
1468 1470- 
1482 1491 
1507 1513 
1531 1534 
1553 1555- 



1590-1591 
1614 1616 
1628 
1644- 
1658 
1671 
1689 
1717 
1737-1738 
1753 1759- 



9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331 
352 372 377 384 414 424 445-446 
454 472 474 491 496 560 579 588 
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916 
922 932 935 1046 1075 1080 1099 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



uterus 



Clontech 



UTR001 



17 19 25 
108 139 
263-265 
446 448 
506 513 
560 601 
773 780 
929 934 
1050 1075 
1258 1279 
1343-1344 
1478 1481 
1552 1579 
1626-1627 



41 
152 
274 
452 
519 
610 
833 
937 



46 57-58 
174 198 



290 
473 
522 
632 
845 
996 



387 
491 
526 
659 
857 



61 89 104 
200-201 206 
408 420 438 
493 499 503 
530 542-543 
665 720 751 
872 877 912 



1107 
1287 
1375 
1498 
1597 
1649 



1009-1011 1018 
1124 1170 1219 
1310 1320 1323 
1437 1451-1452 
1519 1521 1536 
1602 1606 1620 
1652 1661 1670 



1719 1722-1723 



TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


% 

IDENTITY 


1 


Y41736 


Homo 


sapiens 


Human pkoiii4 protein 
sequence.


1398 


100 




IODDDD 


homo 


roeiriDrane- couau protein 


z j ay 




3 


AF113136 


Homo sapiens 


IL-1 receptor-associated- 


3043 


100 


4 


AF017806 


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor


1053d 


98 


4 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


8 


X02761 


Homo sapiens 


fibronectin precursor


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP104 IS -encoded protein. 


2381 


100 


11 


AF117754


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin 
G))) 


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression prgb-13 . 


"1 AAA 

1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1 


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein 
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 pl50 isoform 


6377 


100 


19 


U00059 


Saccharomyces
cerevisiae


Yhr121wp


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O-
sulfotransferase


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701


Homo sapiens 


ribosomal protein L23a 


791 


100 


28


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77 . 


1083 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2 . 


715 


90 


31


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


oil 


a A 





AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2


toil 

loll 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 1 
dioxygenase 


1507 


99


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 

to 287)


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


4726 


99 
SEQ
ID 
NO: 



39

40
41



42 



43 



44 



45 



46 



47 



49 



51 

52 



53 



54 



59 



60 



62 



63 



64 



65 



66 



"69 



70 



71 



72 



73 



74 



75 



76 



77 



78 



79 



ACCESSION 
NUMBER 



SPECIES 



DESCRIPTION 



Y78795 

U93121 
Y42750 



Homo sapiens Human antizuai-2 (AZ-2) amino

acid sequence.

Homo sapiens M-phase phosphoprotein 1



SMITH - 
WATERMAN 

SCORE 
3556 



3747 



AF282626
G02150 



U19617 



Homo sapien s Human calcium binding protein j 795 

1 1 (CaBP-1) . 

Ho mo sapiens | latexin — ^ B 

Homo sapiens Human secretea protein, SEQ i 3 8 4 

? ID NO: 6231. _____ 

. ■ ■ l 2724 

Mus musculus | Elf-1

Mus musculus | Elf-1



U19617

AF100758 Homo sapiens osteoinductive factor OIF



2062 



1538 



Y87591 



X04145 



Homo sapiens Human SPROUTY-l protein, SEQ 1737 

| ID NO: 24. 1 

Homo sapiens T3 gamma precursor (aa -_2 to j 94* 



160) 



X63547 
M94043 



L31783 



Homo sapiens | oncogene
Rattus rab-related GTP-binding

norvegicus | protein
Mus musculus | uridine kinase



5845 



X83973 



AF224741 
W74805 



Z50907 
D79994 



D79994 



Y59738 



Homo sapiens transcription factor
Homo sapiens | chloride channel protein 7
Homo sapiens Human secreted protein

encoded by gene 77 clone

HOEAS24.

Homo sapiens | Human TBC-1 cDNA from second

transcript.

Homo sapiens similar to ankyrin of

Chromatium vinosum.
Homo sapiens | similar to ankyrin of

Chromatium vinosum.

Homo sapiens | Human normal ovarian tissue

derived protein 15.



1089 

"9lT 
4486 



4128 
1491 



4824 



AB031069 | Homo sapiens | protein containing CXXC

domain 1 



6089 


4014 
"60T 
1390 



Y66660 



Y66660 



S70011 
AF139518 



^29666 

AJ245738
AF099138



Homo 
sapiens 



Membrane-bound protein
PRO783.



2492 



Homo 
sapiens 



Membrane-bound protein

PRO783.

tricarboxylate carrier



1709 



AF099138 



Z82059 



AF224278 



AF126426 



Y41652 



Y41652 
AF188622 



AE000406 



X99302 



AL136538 



Homo sapiens Homo sapiens DH13U8_> clone 

secreted protein.

Homo sapiens | claudin-15

Rattus | GLUT4 vesicle protein 

norvegicus 



157 



1206 
4183 



Rattus I GLUT4 vesicle protein 

norvegicus 
Caenorhabdit 



4906 



is elegans 



Homo sapiens 



Homo sapiens
Homo 
sapiens 



Homo 
sapiens
Mus musculus



Escherichia
coli

Homo sapiens 



Schizosaccharomyces
pombe

AF129756 \ Homo sapiens 



Similarity to Drosophila ring | l2Bb 
canal protein comes from 
this gene 



PMEPA1 protein 

neurotrimin

Human MEK2 protein sequence 



1282 
1809 
2065 



Human MEK2 protein sequence. 



1207 



selectively expressed in I 1485 

embryonic epithelia protein-1
— 1 9S0 



putative DNA topoisomerase 



Popl 

similarity to S. cerevisiae 
kti12 protein



655 



210 



G4 



1554 



% 

IDENTITY 
TT 



100 



100 



100 



94 

88



86 



100 



99 



99 
99



96 



100 



99 
91

100
100



99 



99 



Rattus sp. | tricarboxylate carrier i 895 155 

~ Rattus [A-kinase anchor protein I 178 I 24 

norvegicus 



30 



100 
87 



86 



44 



100 



100 



31 



99 
80 


AL096768 


Homo sapiens 


dJ858B16.2
(phosphatidyl serine
decarboxylase (PSSC, EC
4.1.1.65))


2033 


100 


81 


AL096768 


Homo sapiens


dJ858B16.2
(phosphatidyl serine 
decarboxylase (PSSC, EC 
4.1.1.65) ) 


1220 


96 



82 


X57351 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5959 


99 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4 


1305 


99 


86 


AB018423


Mus musculus


SH2 domain-containing protein


1 3fi0 


7ft 


87 


AF272151 


Homo sapiens


adaptor protein CIKS


30A4 


-* ^ 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre-mRNA
splicing
factor-gene_id:MRB17.2


634 


36 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSF 


11676 


99 


93 


Y99365 


Homo sapiens


Human PRO1250 (UN063li amino 

acid sequence SEQ ID NO: 86. 


J o u 


inn 


94 


Y87231 


Homo sapiens


Human signal peptide
containing protein HSPP-8
SEQ ID NO:8.




i nn 

J.WW 


95 


AF227741 


Rattus
norvegicus


protein kinase WNK1


2428 




96 


AF227741 


Rattus
norvegicus


protein kinase WNK1


1961 


94 



97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 


3423 


100 


99 


AC005783 


Homo sapiens


R33083_1


X 7 / *± 




100 


Y95293 


Homo sapiens 


Human GEP containing NEK- like 
kinase aubetrahp eflwif 


4092 


99 


101 


AL118501 


Homo sapiens


dJ1191N16.1 (A novel protein
(translation of the cDNA 
DKFZp566A0946, Em:AL050069) ) 


X J VJ _7 


i on 


102 


AJ006267 


Homo sapiens 


ClnX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240 


831 


64 


106 


M35522 


Canis
familiaris


GTP-binding protein (rab7)


3 


^0 


107 


R99800 


Homo sapiens 


NTII-1 nerve protein,
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


1290 


93 


109 


AC005614 


Homo sapiens 


F23269_2 


3369 


99 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


X52425 


Homo sapiens 


interleukin 4 receptor


4496 


100 


112 


Y41686 


Homo 
sapiens 


Human PRO274 protein
sequence.


2285 


100 


113 


W15506 


Homo sapiens


Mitogen activating protein
kinase ERK1.


1991 


100 


114 


Y71071 


Homo sapiens


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117 


W30891 


Homo 


Human cytostatin III protein. | 715 


99 






PCT/US00/34263



118 



121 



122 



123 
124 



125 



126 



127 



ACCESSION 
NUMBER 



AF116618 Homo sapiens PRO1038 



Y08915 
AF098070



AF052432 



AF083246 



Y27096 



M63109 



U75467 



SPECIES 



sapiens 



Homo sapiens 

Drosophila 

melanogaster 



DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 



1469 



alpha 4 protein 
Lisl homolog 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Leishmania 
major 



Drosophila 
melanogaster 



Z68220 



2637 



NSEQ gene associated with 
matrix remodelling. 



833 



(ACVRP) 



glycoprotein 96-92 



438 



IDENTITY 



100 



37 



99 



IT 



43 



128 



129 



130 



133 



134 



135 



136 



137 



AF095927 



Rattus 
norvegicus 



W92958 



Homo sapiens 



AF115391 



Lactobacillus
sakei



ribokinase RbsK 



508 



X93498 
X93498 



Homo sapiens 
Homo sapiens 



1250 
916 



W52811 



Homo sapiens 



Human DBI/ACBP-like protein
(DBIH) 



Y84444 



Homo sapiens 



M69181 



Homo sapiens 



W74882 



Homo sapiens 



W78200 



Homo sapiens 



Amino acid sequence of a 
human RNA-associated

protein.

non-muscle myosin B 



189 



Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 



480 



Human secreted protein 
encoded by gene 75 clone 
HHGAU81 . 



855 



20 



100 



99 



138 



139 



AL033520 Homo sapiens 



AF020261 



Santalum 
album 



proline rich protein 



119 



30 



140 



141 



142 



143 



X70394 



Homo sapiens 



Y06439 



Homo sapiens 



936 



Z68493 



AB018107



Caenorhabditis
elegans
Arabidopsis 
thaliana 



predicted using Genefinder

ADP-ribosylation factor-like
protein 



596 



65 



144 



145 



146 



147 



148 



149 



150 
151

152



153 



AF161483 



Y84902 



Homo sapiens 



A human proliferation and
apoptosis related protein. 



AB004906 



AC007357 



W75155 



AF056490 



Y58171 
U10397 



Ipomoea 
purpurea 



transposase 



Arabidopsis 
thaliana 



F3F19.18 



Homo sapiens 



Homo sapiens 



Homo 
sapiens
Saccharomyces
cerevisiae



Homo sapiens 



Homo sapiens 



Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 
cAMP-specific
phosphodiesterase 8A 



Human hydrolase homologue 
HHH-7. 
Yhr148wp



146 



647 
1494 



3710 



phosphotyrosyl phosphatase 

activator

dJ382I10.5.1 (novel protein 



785
515



1719
2034



20 



31 



98 



99 



99 



99
99











similar to arginyl-tRNA) 






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6. 


1471


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32 . 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2) . 


383 


100 


160 


J0497Q 


Homo sapiens


J-» JuF Y ^J\^mf*I I— -J- \m* f * »H> ^ 11 •*» W *>^» 1*1 O i. 


« «S S 3 


i no 


161 


W54040 


Homo sapiens


Human interferon-inducible 
protein, HIFI. 


484 


J o 


162


AL022724


Homo sapiens


dJ413H6.1.1 (hamster
Androgen- dependent Expressed 
Protein LIKE PUTATIVE 
protein) (isoform 1) 


X J J » 


inn 


163 


AF125535 


Homo sapiens 


dd21 homolocr 


193 




164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713 . 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944 530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens


Secreted protein encoded by
gene 112 clone HUKFC71. 


1084 


100 



169 


AF214731 


Homo sapiens


ATP-dependent RNA helicase


4409 


i no 


170 


AE000871 


Methanobacterium
thermoautotrophicum




XDD 




171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118.


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel . 


3202 


100 


175 


Y07923 


Homo sapiens


GTP-binding protein


1205 


100 


176 


W90338 


Homo 
sapiens


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens


Human channel-related
molecule HCRM-3.


1122 


100 


178 


Y41674 


Homo sapiens


Human channel-related
molecule HCRM-2.


936 


99 


179 


AF220492 


Homo sapiens


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


181 


U57344 


Mus musculus


Meis3


1813 


89 


183 


U57344 


Mus musculus


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99 



SEQ
ID 

NO: 



194 



ACCESSION 
NUMBER 



SPECIES 



AF084259 Mus musculus



DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



bromodomain-containing 7
protein BP75 



693 



54 



195 



196 



197 
198 



200 



201 



202 



203 



205 



207 



208 



209 



213 



214 



215 



216 



217
218



219 



220 



221 



222 



223 



224 



225 



226 



227 



228 



229 
230 



231 
232 



233 



234 



Y00752 



W95349 



Rattus 
norvegicus 
Homo sapiens 



serine dehydratase (AA 1 - I 994 

327) t 

Human foetal brain secreted ] 2596 
protein fh!70_7. 



AB028859
W95633 

Y44277 



AB030039 



Homo sapiens
Homo sapiens 

Homo 
sapiens 
Homo sapiens 



hPj9 _ 

Homo sapiens secreted protein 
gene clone hm236_1.
Human nucleic acid methylase- 
2. 

hPACPLl 



1890 
1614 

"209< 



2258 



X54162 



G02061 



Homo sapiens 
Homo sapiens 



64 Kd autoantigen 



2918 



Human secreted protein, SEQ 
ID NO: 6142. 



558 



X13885 



J04204 



Nicotiana
tabacum
Bos taurus 



extensin (AA 1-620) 
32 kd accessory protein 




185 



1837 



J04204 



Bos taurus 



32 kd accessory protein



1101 



Y87283 



Homo sapiens 



Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 



1318 



Y02860 



Homo sapiens 



Fragment of human secreted
protein encoded by gene 65.



936 



AL121889 



Homo sapiens 



dJ1076E17.1 (KIAA0823 protein
(continues in AL023803)) 



694 



AF225732
X66295 
Z29328 



Homo sapiens 
Mus musculus
Homo sapiens 



Z29328 



Homo sapiens 



NPDQ07 

C1q C chain

Ubiquitin-conjugating enzyme

UbcH2

Ubiquitin-conjugating enzyme

UbcH2



1345 
970
966



AJ002030 



X70649 



Homo sapiens 
Homo sapiens 



progesterone binding protein



AF250558 



AL021453
Y08565 



Y9445: 



AL035521 



AL031786 



Homo sapiens 
Homo sapiens
Homo sapiens 



Homo sapiens 



member of DEAD box protein 

family

claudin-2 



dJ821D11.1 (PUTATIVE protein)
UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase.

Human inflammation associated

protein 



Arabidopsis 
thaliana 



putative protein 



AL109736 



X52493 



AL035659 



AB032401 



AB032401 



X83502 



X83502 



AF143723
Y66677 



AB027466 
W95634 



W00365 



Y53762 



Schizosaccharomyces

pombe



putative proline-tRNA
synthetase 



Schizosaccharomyces

pombe

Glycine max 



WD repeat protein 



DNA-directed RNA polymerase



Homo sapiens 
Mus musculus
Mus musculus



dJ979N1.1 (dJ979N1.1)
mmDj4
mmDj4



Saccharomyces | J1007
cerevisiae



Saccharomyces
cerevisiae



J1007 



Homo sapiens 
Homo 
sapiens 



Heat shock protein HSP60
Membrane-bound protein
PRO828.



542 



1163 



3933 



1169 



259 
3331 



2067 



315 



811 



626 



136 
5199 



1761 
1988 



112 



79 



2557 
982 



Homo sapiens 
Homo 
sapiens 



Homo sapiens 



Homo sapiens 



spondin 2 
Homo sapiens 

protein. 

Human cyclin B1.



secrete< 



A GTP-binding polypeptide 



2218 



1017 



100 



99 



100 



99 



99 



33 



100 

100



100 



98 



54 



76 



73 

100



98 



100 

100



99 



100 
99



100 



42 



41 



40 



26 



25 




99 

100



99 



100 











designated RAQ. 






235 


Z50749 


Homo sapiens 


yeast sds22 homolog 


1800 


100 


236 


Z50749 


Homo sapiens


yeast sds22 homolog


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium caudatum


phosphatidylinositol-4-phosphate 5-kinase




o t 


239


AB030189 


Mus musculus 


contains transmembrane (TM)
region and ATP binding region


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107 




NY-REN-37 antigen


i nnt; 




244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and 

deaminase) 


763 


99 


245 


U37026


Rattus
norvegicus 




IDA 


7. 0 

J V 


246 


AL078599 




similar to C. elegans
F55A12.9 (Tr:P91085))






247 


U32274 


Saccharomyces cerevisiae


Ydr386wp; CAI: 0.12


191 



37


248 


Y41719 


Homo 
sapiens 


Human PRO864 protein
sequence.


1A79 


100 


249 


AB029434 


Homo sapiens


ghrelin precursor


611 


100 


250 


X97831 


Rattus 
norvegicus


carnitine/acylcarnitine


246 


38 


251 


W80993 


Homo 


Human RIP- interacting factor 

RIF 


1724 


100 


252 


Y94873 


Homo 

sapiens 




X o / o 


inn 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone ATP-2 (HEBGM49).


765 


100 


254 


AL354533 


Leishmania major


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
N0:1. 


2247 


99 


257 


AL035539 


Arabidopsis thaliana


putative amino acid transport protein


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61. 


1171 


100 


259


AL03^689 


Homo sapiens


filiTJI ft 7iTl 1 "J fnotrf*»1 nrn(*Pi'n 
similar to protein kinase C
inhibitors) 




i no 


260 


AE000909 


Methanobacterium
thermoautotrophicum


serine/threonine protein
kinase related protein


363 


30 


261


AL050131 


Homo sapiens 


hypothetical protein 


626 


100 


262 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


263 


AL035593 


Homo sapiens 


dJ310J6.l (novel protein) 


821 


100 


264 


AL022318 


Homo sapiens 


DK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1)


1072 


100 


265 


AF205940 


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein) 


7S9 


100 


267 


AL034548 


Homo sapiens 


dJ1103G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


1888 


99 



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268 



ACCESSION 
NUMBER 



AF161470 



AF161470 



SPECIES 



Homo sapiens | HSPC121



HSPC121 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



1884 



1232 
2lW 



98 



96 



270 



X90763 



Homo 
sapiens 



HHa5 hair keratin type I 
intermediate filament 



271 



272 



AF207600 



Homo sapiens 



M32334 



Homo sapiens 



intercellular adhesion 
molecule 2 



1436



AF161483 



100 



274 



276 



Y53052 



Homo sapiens 



df202_3 protein sequence SEQ 
ID NO: 110. 



Y77576 



Homo sapiens 



Human cytoskeletal protein 
(HCYT) (clone 2195418) . 



277 



278 



AF077042 



Homo sapiens 



Y94907 



Homo sapiens 



30S ribosomal protein S7 homolog

Human secreted protein clone 
cal06_JL9x protein sequence 
SEQ ID NO:20. 



1619 



98 



279 



280 



281 



Y68788 



Homo sapiens 



Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 



Z75134 



Canis familiaris



rod transducin 



Z75134 



Canis familiaris



1816 



282 



283 



AF249873 



Homo sapiens 



AL050007 



Homo sapiens 



98 



284
285



AF201931 



Homo 



DC1 



AF156102 



Homo sapiens 



1318 
1250



99 



286 



Y35897 



Homo sapiens 



Extended human secreted 
protein sequence, SEQ ID NO. 
146. 



287 



U88964 



Homo sapiens | HEM45 



598 



289 



290 



AJ011098 



Y66724 



Homo 
sapiens 



AF034801 



Homo sapiens 



Membrane-bound protein 
PRO836.



292 



293 



294 



AF034801 



AL049851 



Y73348 



Homo sapiens 
Homo sapiens 



dJ889J22B.l (novel protein 
(isoform 1))



1738 



Homo sapiens 



HTRM clone 839651 protein 



1245 



sequence 



1694 



99 



296 



297 



AL035423 



Homo sapiens 



carrier protein-1 (BMCP1) ) 



AF198532 



Homo sapiens 



lymphoid enhancer binding 
factor-1 



2173 



100 



299 



300 



301 



302 



303 



304



305 



306 



308 



309 



AF159141 



Homo sapiens 



breast cancer metastasis- 
suppressor 1 



U26397 



Rattus 
norvegicus 



inositol polyphosphate 4- 
phosphatase 



160 



AF036145 



Z82022 



AF269232 



AJ222644 



AF054180 



AJ272079 



Y44486 



AJ131891 



Homo sapiens 



meningioma-expressed antigen



3458 



Homo sapiens 
Mus musculus 



Arabidopsis 
thaliana 



Homo 
sapiens 



Homo 
sapiens 



Homo sapiens 



butyrophilin-like protein 
BUTR-1 



asparaginyl-tRNA synthetase 



hematopoietic cell derived 
zinc finger protein 



Human GPRW receptor 
polypeptide. 



DNA polymerase mu 



271 



659 



3056 



2598 



30 



100 



50 



50 



79 



100 
100 



100 



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TABLE 2 



SEQ 

ID


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


310


rtfii^J J J 3 


Homo sapiens


p30 DBC


1248 


92


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 




VCTOfl O 
A3 / OUZ 


Homo sapiens


immunoglobulin lambda light 
chain


959


81


TIT 
^ j- J 


ii JO / 13 


Homo sapiens


wet 


2048


98






Homo sapiens


HSPC047


727 


100


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a


3046 


100 


316 


Y66666 


Homo sapiens


Membrane-bound protein
PRO1013.


1166 


100


j± t 


1 2 y ODD 


Homo sapiens 


Human Ras protein RAPR-1.


1253 


98 


318


AJ387747


Homo sapiens 


sialin 


2614 


99


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40


320 


Y68773 


Homo sapiens 


Amino acid sequence of a
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID NO: 66. 


913 


100


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271.


1976 


100 


325


Y94944 


Homo sapiens 


Human secreted protein clone 
bf157_16 protein sequence
SEQ ID NO:94.


2305 


98


326


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100


328 


Z78013 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 



33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94


330 



Z75330 



Homo
sapiens]
>R65207
R65207 02-
MAR-1995 27-
AUG-1993
Human
stromalin-1.
[Homo
sapiens


nuclear protein SA-1 


6492 


99


331


AL008583 


Homo sapiens 


dJ327J16.3 (supported by 
GENSCAN, FGENES and GENEWISE)


2133 


99


332


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489.


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100


334


At J.bbb98 


Mus musculus 


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous to the
Eimeria tenella gene et100


154 


it 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


337


Y85564


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3447 


98 




Zbb561 


Caenorhabdit 
is elegans 


Similarity to Human rab13
protein (PIR Acc. No. 

A/1 QCA "7 \ 
fVl J O % 1 ) . 


716 


34


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439


84
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TABLE 2 



PCT/US00/34263 



SEQ 
ID 
NO: 



344 



345 



346 



347 



348 



350 



351 



352 



353 



354 



355 



356



357 



358 



359 



360 



361 



362 



363 



364 



365 



366 



367 



368 



ACCESSION 
NUMBER 



U10281 



AK000404 



L22557 



L22557 



AL049481 



AJ251516 



AK024477 



U50133 



AK000625 



AF161420 



AJ010014 



AF151029 



AL022327 



W78128 



X03414 



AF151079 



Y538B6 



AF254741 



AF213465 



AF181562 



AF181562 



U73200 



SPECIES 



Sus scrofa 



Homo sapiens 



Rattus
norvegicus 



Rattus 
norvegicus 



Arabidopsis 
thaliana
Mus musculus 



Homo 



Homo 



sapiens 
sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 



% 

IDENTITY 



VDJ region 



gastric mucin 



:79 



unnamed protein product 
calmodulin-binding protein 



1177 



1949 



calmodulin-binding protein 



2363 



AIGl-like protein 



316



cysteine and histidine-rich | 1460
protein 



FLJ00070 protein 



1773 



ankyrin

unnamed protein product 



502 



721 



HSPC302 



2623 



M96A protein 



1269 



HSPC195 



941 



Homo sapiens | dJ355C18.1 (KIAA0027) 



1911 



Homo sapiens 



Drosophila 
melanogaster 



Human secreted protein 
encoded by gene 3 clone 

HOSSI96 . 

Kr polypeptide 



1117 



316 



Homo sapiens 



Homo sapiens 



HSPC245 

A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 



643 



530 



Drosophila 
melanogaster 



Centaurin Gamma 1A



681 



Homo sapiens | dual oxidase 



2016 



Homo sapiens | proSAAS



1319 



AF263744 



U37501 



Homo sapiens 
Mus musculus 



proSAAS 



1024 



Homo sapiens 



Mus musculus 



p116Rip

erbb2-interacting protein

ERBIN

laminin alpha 5 chain



884 



4973 



5867 
549 



24 
99 



84 



91 



30 



99 



100 



33 



100 



97 



47 



91 



100 



100 



45 



100 



46 



100 



100 



99 



82 



99 



72 
36 



369 



370 



371 



AF043695 



Caenorhabdit 
is elegans 



similar to the protein 
phosphatase 2c family



Y73440 



Homo sapiens 



AF272833 



Homo sapiens 



Human secreted protein clone 
yj23__l protein sequence SEQ 

ID NO: 102 . 

misato 



1484 



2869 
3927 



99 



97 
100 



372 



373 



AF198454 



Homo sapiens 



epithelial protein lost in 
neoplasm beta 



Y73345 



Homo sapiens 



HTRM clone 438283 protein 
sequence . 



273 



80 



374 



375 



376 



377 



378 
379



AF169017 



A95106 



W74828 



Homo sapiens 



formiminotransferase
cyclodeaminase 



2717 



unidentified RED ALPHA 



1202 



Homo sapiens 



Human secreted protein 
encoded by gene 100 clone 
HLQAB52 . 



1012 



Y32131 



M14912 



AF090934 



Homo sapiens 



Homo sapiens 



Human LYST-2 protein 
pol ~~ 



3556 



132 



Homo sapiens | PRO0518



382
2499 



98 



99
99



99 



86 



100 
100 



380 



381 



382 



383
384



385 



X66363 



Homo sapiens 



serine/threonine protein
kinase 



Y41699 



Homo 
sapiens 



Human PRO703 protein 
sequence . 



2362 



AF174498 



Homo sapiens 



U64608 
U50133 



Caenorhabdit
is elegans 
Homo sapiens 



GR AF-l specific protein 

phosphatase

coded for by C. elegans cDNA 
ykl73cl2.5 
ankyrin 



7008 



246 
TQ2 



AJ238520 



Homo sapiens 



putative transcription 
factor-like nuclear regulator 



4123 



100 



98 



36 
3T 



97 
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TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


387 


AF208845 


Homo sapiens 


BM-003 


1375 


99 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616 


62 


395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a
human beta IV-spectrin
protein. 


1626 


98 


397 


U48238 


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398 


AL390137


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62 


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CG5986
protein) 


761 


100 


404 


Z68753


Caenorhabdit 
is elegans 


ZC518.3b 


888 


48 


405 


Z78013 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


1168 


100 


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69.


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL031658 


Homo sapiens 


dJ310O13.7 (novel protein 
similar to H. roretzi HRPET- 
3) 


776 


98 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961 


99 


416 


U43503


Saccharomyces cerevisiae


Lphlp 


115 


42 


417 


AL160493 


Leishmania 
major 


possible t26fl7.2l 


239


35 


418 


Y08100


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


U15131 


Homo sapiens 


pl26 


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-binding protein-2


1962 


100


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1084 


55 
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TABLE 2 



PCT/US00/34263 



SEQ 
ID 
NO: 



427 



428 



429 



430 



431 



432
433



434 



437 



438 



439 



440 



443 



444 



446 



447 



448



449 
450 



451 



452 



453



454 



455 
456 



457 



458 



461 



462 



463 



464 



465 



ACCESSION 
NUMBER 



SPECIES 



AF279144 | Homo sapiens 



AE003683 



Drosophila 
melanogaster 



DESCRIPTION 



tumor endothelial marker 7
precursor 



CG8312 gene product 



Y07829 



AF096897 



Homo sapiens | RING finger protein 
Drosophila



pushover 



melanogaster 



U41387 



Homo sapiens | Gu protein 



AF023674 
AF146760 



Homo sapiens | nephrocystin
Homo sapiens | septin 2-like cell division control protein



AB006697 



Y94247 



Arabidopsis thaliana | cleft lip and palate associated transmembrane protein-like

Homo sapiens | Human calcium binding protein hCBP.



AB040672 



AF105228 | Bos taurus 



Homo sapiens | UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase

tuftelin



SMITH- 
WATERMAN 
SCORE 



%



IDENTITY 



1259 



149



2201 



4442 



4021 



3783 



2284 



886 



1704 



1075 



285 



R06463 



X14971 
X53773 



Homo sapiens | Derived protein of clone ICA13 (ATCC 40553).

Mus musculus | alpha-adaptin (A) (AA 1-977) 



3073 



4897 



Rattus 
norvegicus 



Y66689 



Homo 
sapiens 



alpha-c large chain (AA 1-938)

Membrane-bound protein
PRO1136.



3979 



3299 



AC067754 



Arabidopsis | unknown protein; 20348-23707 
thaliana 



114



445 | AF229032 | Mus musculus | plL



AF056035 



s-nexilin 



AF132484 



Rattus 
norvegicus 

Mus musculus | unknown



2077 
2662 



478



W89024 



Homo sapiens | Polypeptide fragment encoded by gene 156.



528 



AF161445 



Z68753 



W39160 



W85727 



Homo sapiens | HSPC327
Caenorhabditis elegans | ZC518.3b

Homo sapiens Human partial complement 



TSoT 



951 



Y53629 



Homo 

sapiens 

Homo sapiens 



factor H protein fragment 3 
Novel protein C Clone 
BM46 10) . 



155 



279! 



D87438 



Homo 
sapiens 



AF240468
Z15005 
M59216 



Homo sapiens 
Homo sapiens 



Y73467



459 | W67824



460 | AF163151



D87446 



G04044 



AC002398 



AF064856 



AF223408 



Homo 
sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Rattus sp. 



Homo sapiens 



A bone marrow secreted 
protein designated BMS115. 
Similar to a C. elegans
protein in cosmid C14H10 
nicastrin 



2810 



4069 



CENP-E 



3687 
13305 



gamma-aminobutyric acid 
receptor beta-1 subunit
Human secreted protein clone 
yd61_l protein sequence SEQ 
ID NO: 156.



2477 



Human secreted protein 
encoded by gene 18 clone 

HSLFM29 . 

dentin sialophosphoprotein 

precursor 



535 



Similar to a C. elegans
protein encoded in cosmid 
C27F2 (U40419) 



9196 



Human secreted protein, SEQ 

ID NO: 8125.
F25965 1 



486 



1018 



7acomp protein 



1845 



B99 



3686 



56 



29 



99 



47 



99 

100



100 



42 



100 



63 



33 



99 



98 



81 



99 



33 



93 



85 



51 



45 



100 



49 



32 



99 



100 



100 



100
99



100 



100 



100 



19 



99 



93 



100 



84 



99 



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TABLE 2 



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


466 


AF223408 


Homo sapiens 


B99 


2878 


87 


467 


AF104415 


Mus musculus 


gene trap locus-13


6336 


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 1
JDP-1 


196 


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene) 


3564 


99 


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39. 


838 


100 


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 


3411 


100 


476 


D38549 


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AL031534 


Schizosaccharomyces pombe


putative asparagine synthase 


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein 


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ238248


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061 


Saccharomyce 
s cerevisiae 


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527 


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487 


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


555 


56 


489 


AL021918 


Homo 
sapiens 


b34I8.1 (Kruppel related Zinc 
Finger protein 184) 


4184 


100


490


X53773 


Rattus
norvegicus 


alpha-c large chain (AA 1-938)


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


1459 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein 
with some similarity to 
Drosophila KRAKEN) 


513 


96 


495 


AF036977 


Homo sapiens 


unknown 


1812 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 



Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126. 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein 


653 


43 


499


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658 


98 


500


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus musculus


putative membrane-associated
guanylate kinase 1 


205 


36 


502


AC £OZo / *± 


Homo sapiens


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


GS protein 


669 


100 


504


AF208861 


Homo sapiens 


BM-019 


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100 


507 


X66285 


Mus musculus 


HC1 ORF 


115 


43 


508 


D00189


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


5227 


99 
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TABLE 2 



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION
NUMBER 


SPECIES 




SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


509 


Y94971


Homo sapiens


Human secreted protein clone
fal71JL protein sequence SEQ
ID NO: 148.


2176


100


510


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


781


77


511


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


1347


100


512


AB019038


Homo sapiens


beta-1,4 mannosyltransferase


1520


99


513 


X84908


Homo sapiens


phosphorylase kinase 


5729 


99 


514


X52851


Homo sapiens


peptidylprolyl isomerase


650


76


515


AF186084 


Homo 
sapiens 


epidermal growth factor

repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ


505 


99 


517 


U04706 


Bos taurus 


150 kDa protein


1749


77
100


518 


G00653


Homo sapiens 


Human secreted protein, SEQ
ID NO: 4734. 


530 




519


AF161475


Homo sapiens


HSPC126


1368 


100 


520 


Y99366 


Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88.


3394 


97


521 


AF266852 


Homo sapiens


PTPLA 


1295


100 


522 


AE000995 


Archaeoglobus fulgidus


chromosome segregation 
protein (smc1)


153 


20


523


AF062249 


Homo sapiens 


immunoglobulin heavy chain 


605 


97 



525 



526 



527 



528 



529 



530 



531 



532 



533 



534 



535 



536 



537 



538 



539 



540 



541 



AJ223830 



ARE1 



W01535 



Rattus 

norvegicus

Homo sapiens | Cellular homologue of the



2950 



1276 



AF145658 



Drosophila 
melanogaster 



SV40 large T 
BCDNA.GH10229 



antigen. 



320 



AF112213 | Homo sapiens



524 



D49387 



Homo 
sapiens 



Y30819 



Homo sapiens 



putative Rab5-interacting protein

NADP dependent leukotriene b4 12-hydroxydehydrogenase | 1616
Human secreted protein encoded from gene 9. | 328



AL079335 | Homo sapiens



Y91506 



X76116 



X76116 



dJ132F21.3 (72.1 KDa protein
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induce MG11.)

Homo sapiens | Human secreted protein 

sequence encoded by gene 56 
SEQ ID NO: 179. 
Caenorhabdit | carrier protein (c2) 
is elegans 
Caenorhabdit 
is elegans 



1059 



1159 



576 



carrier protein (c2) 



506 



X12966 



Homo sapiens | 3-oxoacyl-CoA thiolase propeptide (424 AA)



1972 



Y09267 



Z11773 



D84224 



Homo sapiens | flavin-containing monooxygenase 2

Homo sapiens | SRE-ZBP

Homo sapiens | methionyl tRNA synthetase



2486 



2201 
4741 



D84224 



D84224 



D84224 



Homo sapiens | methionyl tRNA synthetase
Homo sapiens | methionyl tRNA synthetase
Homo sapiens | methionyl tRNA synthetase



3887 



2933 



4529 



J03244 



Bos taurus 



H+ ATPase 31kDa subunit (EC 3.6.1.3)



848 



98 



83 



33 



79 



100 



32 



99 



98 



50 



50 



100 



100 



99 



99
99



96 



99 



77 



542 



543 



544 



545 



Y92514 



Homo sapiens 



AF221712 



Homo 
sapiens 



Human QXRE-ll. 

Smad- and Olf-interacting
zinc finger protein 



AE000919 



Methanobacterium
thermoautotrophicum



conserved protein 



A06669 



synthetic 
construct 



preTGF-betal 



2301 



2151 



207 



2070 






99 



61 



38 



99 
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TABLE 2 



SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 




Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60 . 


854 


98 




Hril^i <i U b 


Homo sapiens 


wsb-1 protein 


2275 


100 


548 


X60271 


Mus musculus


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis
thaliana


putative GTPase 


810 


42 


550


X7U4G0 



Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551


J\3\J%oj 6 b 


Homo sapiens 


NEDD4-like ubiquitin ligase 1 


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4. 


1112 


95 


553 


AF119855 


Homo sapiens


PRO1847


265 


67 


554 


M17236 


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 


AL078468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844)


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein 


1623 


98 


558


M12140 


Homo sapiens 


pol gene protein; Xxx 


117 


48 


559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73 . 


225 


56 


560 



X56681 


Homo sapiens 


junD protein 


373 


88 


561 


AF003136 


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif 


2926 


54 


562 


AL109839


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1 
(poly(A)-binding protein)


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564


AF052723 


Feline
leukemia
virus


gag-pol precursor polyprotein 
gPr80 


1547


43 


565




Homo sapiens


tip T"i/-i i o -> 


439 


44 


566


I c u O J. A 


Homo sapiens


pcJ2o^4 secreted protein. 


3338


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


C Q 


AF155113


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


99 


571 


AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 


98 


572 


M69181 


Homo sapiens 


non-muscle myosin B 


7350 


99 


573 


M69181 


Homo sapiens 


non-muscle myosin B


7311 


98 


574 


Y59678 


Homo sapiens 


Secreted protein 108-008-5-0- 
E6-FL. 


772 


100 


575


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


576


AL365234


Arabidopsis
thaliana 


putative protein 


788 


40 


577


AUb /4b 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578


AB041642 


Homo sapiens 


PAR-6


1342 


100 


579


no /r q a /t 
Uob 7 0 4 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580


Atf lbbl24 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100 


583


P92219 


Homo sapiens 



CR1 protein. 


11425 


99 


584 


AJ223948


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


3874 


99 


586 


Y42384 


Homo 
sapiens 


Amino acid sequence of 
lv310_7. 


1007 


37 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


98 
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TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 




SMITH-
WATERMAN 
SCORE 


IDENTITY 


588


AF1317 7d 


Homo sapiens


Tin If Tirjwri 


1929 


99 


589


AJ250865 


Homo sapiens 




2348 


100 


591 


Z98885 


Homo sapiens 


dJ522J7.2 (bromodomain- 
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571


Homo sapiens 


nuclear hormone receptor


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a


4443 


100 


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein)


212 


55 


596 


AL022329 


Homo sapiens


bK407F11.2 (adrenergic, beta, 
receptor kinase 2)


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo 
sapiens] 
>Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3.5
protein.

[Homo 
sapiens 


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


MXltTlcin. II^JL WIcA-L uvai-iaii uaocpug; j 


1574 


99 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 

encOQeoi Dy yent; «u- 


895 


100 


602 


AF218584


Homo sapiens 




3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


QuiyoDlZ . 1 \]b.±AA.\J i to 1 


2413 


99 


605 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen
triple helix repeat
containing protein)


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L


2603


100 


608 


X86098 


Homo 
sapiens 


OinQS Qirecciy t.o auenuvnuB 


2069 


100 


610 


AF163572 


Homo sapiens 


rorssiuaii giycuiA^iu 


1865 


99 


611 


AF161503 


Homo sapiens 




1261 


97 


612 


L41834 


isnsis minor 




345 


30 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9) . 


3668 


100 


614 


AL022327 


Homo sapiens 


r4.T^c:^n ft "I {KTAA0027) 


361 


94 


615


X85786 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609 


97 


618 


U28789 


Mus musculus 




5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
163.


1684 


99 


620 


AB046382 


Mus musculus


foah i o-ahimdant fmasr 
L e 9 u i a auuuuaii l> - 1 - ^-j. 

protein 


199 


23 


621 


Y00062 


Homo sapiens 


precursor poi.ypcpt.j.vAc \xxr» ^ 


3440 


99 


622 


AF068286 


Homo sapiens 


rHJ\-.riJJJJ Or 


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624


X61100


Homo sapiens


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968 


Homo sapiens 


RII-alpha subunit (AA 1-404)


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_1 derived protein


1983 


100 



TABLE 2



SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


629 


Y50911 


Homo sapiens 



Human fetal brain cDNA clone 
vb7_1 derived protein


1694 


100 


630 


AF098786 



Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


1754 


100 


631


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67)) 


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


2236 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens 


serine/ threonine protein 
kinase 


1589 


100 


636 


Y11234 


Homo sapiens 


AFX1 


2571 


98 


637 


AB004884 


Homo sapiens 


PKU-alpha 


3718 


99 


638 


AJ002303 


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C.elegans 
protein encoded in cosmid
T26A5.


2676 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PR02822 


185 


76 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein


10110


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog 


827 


96 


649


AF236061 


Oryctolagus
cuniculus


RING-finger binding protein


3830 


91 


650 


AL034553 



Homo sapiens 


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388 


99 


654 


AC004614 


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


655 


Y57908 


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin 


4752 


59 


663 


AF182316 


Homo sapiens 


myoferlin 


6232 


99 


665 


AL161516 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y13355 


Homo sapiens 


Amino acid sequence of 
protein PRO220. 


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 

beta-N-acetylglucosaminidase 

gene 


611 


52 


671 


X56123 


Mus musculus 


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus 


1 transducin 


3619 


92 
TABLE 2



678 



679 



682 



683 



684 



685 



686 



687 



ACCESSION 
NUMBER 



AF271388 



AF118566 
Y51415 



AL133545 



SPECIES 



norvegicus 



DESCRIPTION 



Homo sapiens 



Homo 
sapiens 



Homo sapiens 



R32611_l _^ 

reverse transcriptase 
homolog=pol {retroviral 
element } 



CMP-N-acetylneuraminic acid 
synthase 



protein 

Human wild type pKe83 
protein . 



SMITH - 
WATERMAN 
SCORE 



2273 



Y86214 



Homo sapiens 



DA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 



Y94952 



AL021878 



AE000198 



M58378 



Homo sapiens 



Homo sapiens 



Escherichia 
coli 



Nuclear transport protein 
clone hfb34l protein 

sequence. 

Human secreted protein clone 



5888 



£hll6_ll protein sequence 
SEQ ID NO: 110. 



dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2)) ____ 



154 



628 



3730 



508 



IDENTITY 



100 



100 



68 



99 



67 



100 



689 



690 



691 



692



699



701 



702 



703 



704 



705 



706 



707 



708 



709 



710 



711 



U09355 



AF155106 



Homo sapiens | NY-REN-36 antigen



265 



AC004774 



Homo 



Dlx-5 



X90530 



Homo 



1926 



X90530 
X90530 



1405 
T590" 



695 | G01563
696 | AC011810



Homo sapiens 

Arabidopsis 
thaliana



697 | AJ250425



698 | AB037901



Human secreted protein,
ID NO: 5644.
Putative methionine
aminopeptidase
Collybistin I



669 



Rattus 
norvegicus 
Homo Tgene 

sapiens | cell _____ 

Homo sapiens Human PR01327 



amplified in squamous 
carcinoma -1 



2455 



5364 



acid sequence SEQ ID 



amino 
NO: 218. 



AF221712 



Homo 
sapiens 



X83573 



Homo sapiens 



Smad- and Olf- interacting 

zinc finger protein 

ARSE ~ 



6705 



3184 



AJ243274 



Y71262 



Y71262 



Y41257 



AL022237 



AJ006266 



Homo sapiens | AP-2rep protein

Homo sapiens Human chondromodulin-like 



2078 



1597 



protein, Zchm1.



Homo sapiens | Human chondromodulin-like

protein, Zchm1.

Homo sapiens | Amino acid sequence of long



1736 



10S0 



human FAIM. 



Homo sapiens 



bKH9lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 

1) ) 



2030 



G01571 



Y08698 



Y68770 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ 
ID NO: 5652. 

ranbp3 _ 

Amino acid sequence of a
human phosphorylation 
effector PHSP-2 . 



2849 



754 



50 



99 



52 



98 



99 



100 



99 



99 



94 



99 



100 



98 



99 
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


712 


U93574 


Homo sapiens 


putative pl50 


799 


59 


713


AC004531


Homo sapiens 


Gene with similaity to DEAD 
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 


Homo sapiens 


DA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase


1696 


93 


718 


Y96290 


Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 


Homo 

sapiens] 

>W41564 

W41564 08- 

OCT-1997 05- 

APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain.


1591


99


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988


46 


727 


AC024818 

Caenorhabditis elegans


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862 


97 


733 


G02650


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813 


Caenorhabditis elegans


Hypothetical protein
Y54F10AL.a


152 


24 


735


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033 


Caenorhabditis elegans


similar to S. cerevisiae YJU2
protein


605


41


737 


AF079098


Homo sapiens


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 



TABLE 2



SEQ 
ID 
NO: 



738 
739 



740 



741
742



743 



744 



745 



746 



747 



748



749 



750 



751
752



755 



758 



759 



760 



761 



762 



763 



764
765



766 



767 



768 



769 



770 



771 
772 



773 



774 



776 



777 



778 



779 



780 



781 



782 



783 



784 



ACCESSION 
NUMBER 



AJ131712 



AJ133115 



SPECIES 



DESCRIPTION 



X98258 



X98258 



U97191 



X76057 



G03209 



X97064 



W93946 



Y73338 



M19529 



AJ249457 



AC00441Q 
AF074968 



AF252284 
AB049629 



"754 D79205 



Homo sapiens | nucleolar RNA-helicase
Homo sapiens | TSC-22-like protein



Homo sapiens | M-phase phosphoprotein 9
Homo sapiens | M-phase phosphoprotein 9



Caenorhabditis elegans | strong similarity to the yfti sub-family of RAS proteins
Homo sapiens | phosphomannose isomerase



Homo sapiens 



Human secreted protein, SEQ 
ID NO: 7290. 
Homo sapiens | Sec23 protein 



Homo sapiens Human regulatory molecule 

HRM-2 protein

HTRM clone 3376404 protein

sequence.

follistatin A



Homo sapiens 



SMITH- 
WATERMAN 
SCORE 



2793 



2054 



953 



564 



960 



2191 



496 



4034 



994 



1565 



Sus scrofa

Trichomonas centrin, putative 
vaginalis 

Homo sapiens | ros39554_l
Homo sapiens | p47ING3 protein



1906 

TIT 



2094 



2167 



Homo sapiens | transcription specificity

factor Sp1

Homo sapiens | phospholysine

phosphohistidine inorganic
pyrophosphate phosphatase
ribosomal protein L39



4005 



1375 



AB008430



L32162



AF037204 



Homo sapiens 
Homo sapiens 



160 



Homo sapiens 



CDEP

transcription factor 



142
574



Y44250 



AF218586 



U38934 



AF226053 



X13403 



D87446 



AL023828 



Y82777 



X92475 



Y42752 



X51416 



Homo sapiens | RING zinc finger protein

Human cell signalling 



295 



Homo 
sapiens 
Homo sap 



625 



.ens 



Gallus 
gallus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



protein- 13 . 
Cide-b 



1136 



histone H2A



625 



HSKM-B 
Oct-1 protein 



606 



(AA 1 - 743) 



3626 



Caenorhabdit 
is elegans 



Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



AJ006591 



AO 86 95 



Z12173 



Y91950 



AL023799
AL023799
G01880 



AJ012590 



Homo sapiens
Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo 
Homo 



sapiens 
sapiens 



AL078582 



Z75955 



AL109965 



AF061262 



G03873 



Homo sapiens 



Homo sapiens 
Caenorhabdit 
is elegans 



Homo 
sapiens 



Mus 

musculus 



Homo sapiens 



Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 



568 



Y17G7B.14 



200 



Human chordin related protein 
(Clone dw665_4) . 



2551 



ITBA1

Human calcium binding protein 



1429 



1426 



3 (CaBP-3). 

hormone receptor hERRl (AA 1- 
521) 



2641 



cysteine-rich protein 



1793 



rap2 

N-acetylglucosamine-6- 

sulphatase 

Human cytoskeleton associated 

protein 5 (CYSKP-5) . 

dJ322P7.1 (zinc finger) 



935 



2970 



565 



855 



dJ322P7.1 (zinc finger)
Human secreted protein, 
ID NO: 5961. 



SEQ 



855 
849 



glucose 1-dehydrogenase



4155 



dJ1 30E4 .2 (KIAA0796) 
similar to mitochondrial 
carrier protein 



1321 



dJ1121G12.2 (SCAN domain-
containing protein)



semaF cytoplasmic domain 
associated protein 2 



Human secreted protein, SEQ
162 



384 



900 



1316 



649 



% 

IDENTITY 



100 



99 



100 



74 



85 



100 



98 



99 



100 



100 



100 



100 



99 



77 



54 



100 



100 



97 



32 



100 



38 



27 



99 



100 



100 



97 



100 

100



100 



43 



56 



99 



68 



34 



100 



83 



95 



WO 01/53312 



PCT/US00/34263 



TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE


% 

IDENTITY 









ID NO: 7954 . 






785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence.


1048 


99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ005710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 


797 


U15155


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis elegans


strong similarity to the
P13/P14 family of kinases 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100


800 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86 


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802


AF208851 


Homo sapiens 


BM-009 


766 


80 


803


Z81097 


Caenorhabditis elegans


Similarity to Human
retinoblastoma-binding
protein RBAP46 yk662d!2.5 
comes from this gene 


152 


27 



804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121573


Homo sapiens 


OA305P22.1 (novel protein) 


1160 


100 


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


808 


AB013885 


Homo sapiens 


beta-ureidopropionase


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 



813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFTl 9 polypeptide. 


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacteriu 
m 

tuberculosis 


pth 


300 


36 


818 


AB030483 


Mus musculus


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca
domestica


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccha 


caffeine-induced death 


184 


29 



TABLE 2



SEQ 
ID 

NO: 



825 



826 




828 



829 



830 



832 



833 



834 



835 



836 



837 



838



839



840 



841 



842 



843 



844 



845 



847 

Tab 



849
850



851 



852 
853 



854 



855 



856 



857 



858



859 



860 



861 



862 



863 



864 



ACCESSION 
NUMBER 



AJ006692 



U23037 



Y30827 



Y32199 



W78279 



AB011542 



G02639 



AF119664 



AF119664 



AF119664 



X12517 



U32865 



AF067730 



SPECIES 



romyces
pombe 



Homo sapiens 
Oryctolagus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Drosophila 
melanogaster 



U27831 



AF286366 



G02309 



AE003615 



G01350 



U27838 



Y87788 
AF164794 



U41315 



AF192784 



Y58I28 



Z22968 
Z22971 



G03362 



G03362 



AF285118 



AC006069 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



protein 1 



ultra high sulfer keratin 



eIF-2Bepsilon 



Human secreted protein 
encoded from gene 17. 



Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 



Fragment of human secreted
protein encoded by gene 33 



MEGF9 



Human secreted protein, SEQ 
ID NO: 6720 



transcriptional regulator 
protein HCNGP 



transcriptional regulator 
protein HCNGP 



transcriptional regulator 

protein HCNGP

C protein (AA 1-159) 



SMITH- 
WATERMAN 
SCORE 



693 



3406 



113 



1012 



1264 



2097 



223 



1574 



1144 



1448 



918 



linotte protein 



164 



TLS-associated protein TASR-2 | 631



striatum-enriched phosphatase | 2840



CamKI-like protein kinase

SEQ 



Drosophila 
melanogaster 



Human secreted protein, 
ID NO: 6390. 
ade3 gene product 



1796 
278 



113 



Homo sapiens 
Mus musculus



Homo sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 



AL021546 



L02956 



AF201947 



L31783 



AF161472 



Z49068 



AF154108 



Homo sapiens 

Arabidopsis 

thaliana 



Homo sapiens 



Xenopus 
laevis 



Homo sapiens 
Mus musculus 



Homo sapiens 



Caenorhabdit 
is elegans 



Homo sapiens 



Human secreted protein, 
ID NO: 5431. 



SEQ 



629 



glycosyl -phosphatidyl - 



"3305 



inositol -anchored protein 

homolog 

Human RBP-26 protein 
Diff33 protein homolog 



2026 
2398 



ZNF127-Xp 
makorin 1 



2458 



2062 



%

IDENTITY 



Protein regulating gene
expression PRGE-21.



1548 



M130 antigen
M130 antigen extracellular 
variant 



6205 
6380 



Human secreted protein, SEQ | 330

ID NO: 7443.

Human secreted protein, SEQ 
ID NO: 7443. 



203 



CGI-203 



452 



putative cleavage and 
polyadenylation specifity 
factor 



1383 



Cytochrome C Oxidase
Polypeptide VIa-liver
precursor (EC 1.9.3.1)



593 



ribonucleoprotein 



1664 



MEK binding partner 1
uridine kinase 



616 
1266 



HSPC123 



602 



mitochondrial carrier protein 3 7U 



tumor necrosis factor type 1 | 3559



68 
90 



100 



99 



100 



70 



100 



89 



94 



100 



24 



56 



98 
100 



98 



48 



100 



"96 



100 

100



93 



97 



100 



96 



100 



100
55



100 



85 



100 



92 
73 



43 



99 






TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 








receptor associated protein 






865 


AE001530 


Helicobacter 
pylori J99 


putative 


230 


32 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


699 


91 


867 


AL031673 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 
type Zinc finger domains) 


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100 


869 


AF192968 


Homo sapiens 


high-glucose- regulated 
protein 8 


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


99 


871 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


872


AF151534 


Homo sapiens 


core histone macroH2A2.2


1866 


100 


873 


AL021331 


Homo sapiens 


dJ366N23.1 (putative C. 
elegans UNC-93 (protein 1, 
C46F11.1) LIKE protein) 


1129 


100 


874 


X14608 


Homo sapiens 


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334 


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA 
DKFZp434N061, Em:AL110249) ) 


306 


100 



876 


X79489 


Saccharomyces cerevisiae


E-925 protein 


446 


35 


877 


Y53001 


Homo sapiens 


Human secreted protein clone 
dn834_1 protein sequence SEQ

ID NO: 8. 


811 


100 


878 


AF281064 


Homo sapiens 


CHMP1.5


957 


100 


879 


X79417 


Sus scrofa 


40S ribosomal protein S12 


687 


100 


880 


AF001317 


Saccharomyces cerevisiae


Soilp 


478 


28 


881 


Y87275 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52. 


2547 


100 


882 


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883 


AB041261 


Homo sapiens 


calcium-independent
phospholipase A2 


2903 


100 


884 


AF020313 


Mus musculus 


proline-rich protein 48


999 


84 


885 


Y10936 


Homo sapiens 


hypothetical protein 


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein 

HTMPN-17. 


1099 


94 


888 


AL117635 


Homo sapiens 


hypothetical protein 


929 


99 


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT9


2046 


99 


890 


Y36031 



Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 



100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 


AF237631 


Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 


1798 


100 


893 


AF090929 


Homo sapiens 


PRO0477p


653 


99 


894 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


3196 


100 



895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


2825 



96 


896 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


897 


AE003551 


Drosophila 
melanogaster 


CG18176 gene product


633 


33 



TABLE 2



898 



899 



900 
901



902 



903 



904 



905 



906 



910 



911 



912 



913 



914 



915 



916 



918 



919 



922 



923 



924 



925 



926 



927 



928 



930 



931 



ACCESSION 
NUMBER 



AJ237946 



Z97184 



Z97184 



AJ245587 



AF091034 



R95953 



L04733 



AE003540 



M55542 



M55542 
W84085 
AF168676 



AB029150 



G02871 



G03162 



SPECIES 



DESCRIPTION 



Homo sapiens j DEAD Box Protein's" 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



2443 



624 



Homo sapiens | HKE2



Homo sapiens | Kruppel-type zinc finger 



1942 



Homo sapiens | GTP-binding protein RAB22A



1011 



Homo sapiens Eukaryotic cell growth 

inhibiting factor. 



414 



Homo sapiens 



1936 



Drosophila 
melanogaster 



CG10984 gene product 



446 



Homo sapiens guanylate binding protein 

isoform I 

guanylate binding protein 



2993 



Homo sapiens 

Homo sapiens 

Homo 
sapiens 



isoform I [ 

Human membrane fusion protein 1 18 85) 

WDProl . ^ 

TNF intracellular domain- 16 4 7 

interacting protein 



Homo sapiens j KRAB zinc finger protein 

HFB101L 



2196 



522 



ID NO: 6952. 



387 



ID NO: 7243 



AJ243721 



U24189 



Homo 
sapiens] 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . [Homo 

sa piens _____ 

Caenorhabdit hypothetical protein 1207-1; 



244 



is elegans 



Y02591 



Method: conceptual 
translation supplied by 

authors 

Homo sapiens | A human progesterone receptor | 843 

complex p23-like protein.



AE000984 



M23159 



Archaeoglobus fulgidus | dinitrogenase reductase activating glycohydrolase (draG)

DHFR-coamplified protein 



Cricetus 
cricetus 



L12018 



Caenorhabdit | putative 
is elegans 



AF102177 
AL096712 



Homo sapiens | tumor antigen SLP-8p
Homo sapiens dJ744I24.2 (similar to a 

novel human gene mapping to 

, Activator) 

AL161495 I Arabidopsis I putative WD- repeat protein 

thaliana | ^ 

AL161495 Arabidopsis putative WD-repeat protein 



1260
1017



866



442 



thaliana 



U97001 



X71978 



Caenorhabdit similar to 

is elegans [ schizosaccharomyces pombe 

Mus musculus Fi£ ~" ~ 



1503 



M92288 



Y27575 



Y22499 



2249 



AJ224326 



sequence clone mh703_l. 
Homo sapiens ribulose-5-phosphate- 



912 



epimerase 



U28991 



Caenorhabdit | coded for by C. elegans cDNA 



660 



100 



100 



96 



98 



96 



100 



100 



100 



100 



87 



41 



99 



30 



41 



42 



95 
51 



100 



100 



TABLE 2



SEQ 
ID 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






is elegans 


cm21c7 






932 


AL080065 


Homo sapiens 


hypothetical protein 


210 


25 


933


G01894


Homo sapiens


Human secreted protein, SEQ
ID NO: 5965.


767 


98 


934 


AJ 2 /b485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 


1142 


80 


936


ABUzbaU a 


Mus musculus


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2601 


99 


938


X65724 


Homo sapiens 


ORF2


498 


100 


939


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942 


AC024200 


Caenorhabditis elegans


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


350 


69 




AF129756


Homo sapiens 


G5c 


273 


100 


944




Rattus
norvegicus


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis 
thaliana


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD021 protein 


551 


44 


947


Ar Ubb4 73 


Homo sapiens 


GAGE-8


273 


51 


948 


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


AF143956 


Mus musculus 


coronin-2 


2300 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


1861 


99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


282 


67 


952 



AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46 



953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-1998 Human NCE-2 
protein. 


365 


100 


954 


AF145615 


Drosophila 
melanogaster 


BcDNA.GH03377


823 


46 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131


2483 


99 


956


U09410


Homo sapiens 


zinc finger protein ZNF131 


1853 


99 


957


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 




AS74317 


Drosophila 
melanogaster


head-elevated expression in
0.9 kb


155 


32 


959 


U54807 


Rattus 
norvegicus 


GTP-binding protein 


1167 


97 




Ar Uboo U / 


Bos taurus 


GTP -binding protein rah 


606 


97


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40 


963 


AP001754


Homo sapiens 


transient receptor potential-
related channel 7, a novel
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


CU1100H13.1 (putative novel 
protein) 


1129 


100 


965 


X61381 


Rattus 
rattus 


interferon-induced protein


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate 
3-kinase isoenzyme


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1) 


893 


100 



TABLE 2



SEQ 
ID 
NO: 



968 



969 



970 



971 



972 



973 



974 



975 



976 



977 



978 



979 



980 



981 



982 



983



984 



985 



986 



987 



988 



989 



990 



991 
992



993 



994 



995 



996 
997 



998 



999 



ACCESSION 
NUMBER 



U79275 



AJ011306 



AF281134 



U53336 



Homo 
sapiens 



Homo sapiens 
Caenorhabdit 
is elegans 



AC018749 



1000 



AF188504 



U25801 



AF049523 



1001 



1002 



1003 



AF161530 



1004 



1005 



G04020 



AF164797 



U94991 



S73775 



Y94888 



AJ243191 



X65020 



AJ249207 



Z30093 



Leishmania 
major 



Mus mus cuius 



Homo sapiens 



Homo sapiens 



protein HYPA/FBP11 



Homo sapiens 



Homo sapiens 
Homo sapiens 




Xenopus 
laevis 
Homo sapiens 



Homo 
sapiens 



HSPC182 [ 

Human secreted protein, SEQ

ID NO: 8101.

ribosomal protein L17 isolog
transcription factor XLMOl 



calmitine; calsequestrine



Human protein clone HP01462. 



Homo sapiens | heat shock protein



Bos taurus 



PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 



AB030835 



AF227258 



AL022238



AL022238 



AF161426 



AF161426 
AF161426 



AL023859 



AL049631 



Rhodococcus sp. AD45 | putative racemase

Homo sapiens | basic transcription factor 2, 

35 kD subunit 
Homo sapiens | contains two glutamine rich 

domains, three zinc- finger 
domains, and matrin 3 
homologous domain 3 (MH3) 



4697 



Bos taurus | RPGR- interacting protein-1 



Homo sapiens 



dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)



4048 



Homo sapiens dJ1042K10.2 (supported by 

GENSCAN, FGENES and GENEWISE)



HSPC308 



Homo sapiens 
Homo sapiens | HSPC308 

Homo sapiens | HSPC308

Schizosaccha | tRNA-splicing endonuclease



448 



448 
"45T 



subunit 



AC005253 



AF265206 
AJ248285 



AE003641 



W69343 



AY007135 



Y73381 



AF208844 



AE004944 



AL031431 



S45367 



romyces 

pombe

Homo sapiens | dJ513M9.1 (novel Homeobox

domain protein) 
Homo sapiens | R26445 1



172 



241 



902 



Homo sapiens
Pyrococcus
abyssi
Drosophila
melanogaster



MOG1 isoform A 

sarcosine oxidase, subunit
beta (soxB) 



974 



195 



BG:DS00941.3 gene product 



218 



Homo 
sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 
Pseudomonas 
aeruginosa 



Homo sapiens 



Canis
familiaris



Secreted protein of clone 

CR930__1. 

similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number
M24102.1



1340 



1543 



HTRM clone 1877278 protein 
sequence.



1668 



BM-002 

hypothetical protein



dJ462O23.2 (novel protein)
centractin 



2058
1949 



92 



92
92



42 



47 



100 



100 



28 



58 



98 



100 



100 



100
35



100 

100



TABLE 2



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1006 


S45367 


Canis
familiaris


centractin


1315 


98 


1007 


AB022158 


Mus 

musculus 


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z63218 


Caenorhabditis
elegans


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374. 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated 
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


Sex-regulated protein janus-a


674 


100 


1021 


AF190625 


Coturnix 
coturnix


qdgl-1 


636 


96 


1022 


AL133363 


Arabidopsis 
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U80736 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis
elegans


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306.


1921 


97 


1039 


U88173 


Caenorhabditis
elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8


331 


80 






SEQ 
ID 
NO: 



1040 



1041 



ACCESSION 
NUMBER 



AF290204 



SPECIES 



DESCRIPTION 



Homo sapiens blood group carrier molecule 



DOK1 



SMITH- 
WATERMAN 
SCORE 



1637 



IDENTITY 



99 



1042 
1043 



1044 



1045 



1046 



1047 



1048 



1049 



1050 



1051 



1052 



1053 



1054 



1055 



1056 



1057 



1058 



1059 



1060 



1061 
1062 



1063 



1064 



1065 



1066 



1067 



1068 



1069 



1070 



1071 



1072 



1073 



Y96730 



Homo 



PRO539, a Costal-2 homologue.



162 



1074 



1075 



AF140683 



AF151023 



sapiens

Mus musculus | F-box protein FWD2
Homo sapiens | HSPC189



2397 



1104 



AF181631 



Y77985 



AJ243972 



Drosophila | BcDNA.GH04929
melanogaster
Homo sapiens | Human collectin amino acid

sequence.

Homo sapiens | 6-phosphogluconolactonase



204 



1940 



AB035863 



Homo sapiens | ATP specific succinyl CoA

synthetase beta subunit 
precursor 



AL034550 



AF163825 



Homo sapiens | dJ1184F4.2 (novel protein 

similar to nucleolar protein 
4 (NOL4) (NOLP)) 
Homo sapiens | pre-B lymphocyte protein 3 



AF201949 



Homo sapiens | 60S ribosomal protein L30 

isolog 



AF190624 



AE003529 



Mus musculus | mdgl-1 
Drosophila 
melanogaster 



CG6151 gene product 



G01191 



Homo sapiens | Human secreted protein, SEQ 

ID NO: 5272. 



AL162756 



Neisseria | Glu-tRNA(Gln)
meningitidis | amidotransferase subunit A



AF181856 



Rattus 
norvegicus 



tRNA selenocysteine 
associated protein 



U89649 



Chlamydomonas | Mr19,000 outer arm dynein
reinhardtii | light chain



AF159141 | Homo sapiens 



AF230929 



Homo 
sapiens 



breast cancer metastasis- 

suppressor 1 

keratinocyte annexin- 
protein pemphaxin 



AJ270952 | Homo sapiens | putative membrane protein



1317 



2324 



981 



634 



868 



236 



160 



646 



682 



1525 



244 



663 



1710 



1363 



AF224263 
X63417 



Heterodontus 
francisci



HoxD8 



AL079345 



Y71112 



Homo sapiens | IRLB 

Streptomyces | hypothetical protein 
coelicolor 
A3(2)

Homo sapiens | Human Hydrolase protein-10

(HYDRL-10).



AF263614 



Y13356 



Homo sapiens | acetyl-CoA synthetase
Homo sapiens | Amino acid sequence of



protein PRO221.



AC006153 | Homo sapiens



similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292)



Y18930 



Sulfolobus 
solfataricus 



hypothetical protein 



R65969 



Homo 

sapiens T98G 



Glioblastoma-derived 
polypeptide. 



Y07964 | Homo sapiens | Human secreted protein 

fragment 

CDK5 activator-binding 



AF177476 



Rattus 
norvegicus 



protein 



AF245505 | Homo sapiens | adlican



U92794 



Mus musculus 



alpha glucosidase II, beta 
subunit 



G03889 



Homo sapiens 



Human secreted protein, SEQ 
ID NO: 7970. 



U15779 



Y13392 



Homo sapiens 
Homo sapiens 



p70

Amino acid sequence of



742 



1037 



143 



2547 



3493 



1363 



662 



162 



887 



863 



1995 



3109 



147 



698 



380 



1271 



22 



98 



100 



37 



100 



100 



99 



92 



100 



100 



85 



44 



98 



44 



99 



34 



53 



99 



100 



83 



100 



27 



100 



99 



100 



98 



29 



100 



96 



86 



99 



36 



98 



28








protein PRO328.






1076


AF161457


Homo sapiens


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66 


1079 


AL132965 


Arabidopsis 
thaliana


putative WD-40 repeat-protein 


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like
protein


579 


100 


1082


AF016416 


Caenorhabditis
elegans


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase


802 


45 


1084 


AB041541 


Mus musculus


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.


202 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


613 


100 


1090 


AK023982 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


U34973 


Mus musculus 


protein tyrosine phosphatase- 
like 


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 


56 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabditis
elegans


similar to thioredoxin 


242 


39 


1099 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5)


968


100 


1106 


U28016 


Mus musculus 


parathion hydrolase
(phosphodiesterase)-related
protein 


1624 


87


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814.


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7.


941 


48 


1111 


Y28921 


Homo
sapiens 


Human regulatory protein 
HRGP-7.


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ 


475 


96 






1115 



1116 



1117 



1118 



1119 



1120 



1121 



1122 



1123 



1124 



1125 



1126 



1127 



1128 



1129 



ACCESSION 
NUMBER 



AF229439 



L40357 



L40357 



A12155 



AL161542 



AL023754 



Y57901 



Z14122 
AF225418



Y06518 



AL035690 



AJ000217 



AB030505 



Y73375 



Y78941 



SPECIES 



DESCRIPTION 



ID NO: 8120. 



Homo sapiens 



Homo sapiens 



Homo sapiens | Human X5L cDNA.



Arabidopsis
thaliana 



isomerase like protein 



Homo sapiens 



Homo sapiens 



dJ272L16.1 (Rat 
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein)
Human transmembrane protein 
HTMPN-25. 



Xenopus 
laevis 
Homo sapiens 



XLCL2 



Homo sapiens 



Homo sapiens 
Homo sapiens 



protein ZIP.

dJ202I21.1 (novel protein) 



SMITH- 
WATERMAN 
SCORE 



1697 



404 



607 



2341 



321 



1531 
3227 



952 



1286 



UBE-1C2 



1069 



Homo sapiens 



874 



Homo sapiens 



Cyclophilin-type peptidyl 
prolyl cis/trans isomerase 
amino acid sequence. 



877 



% 

IDENTITY 



98 



36 



100 



99 



79 



100 



100 



1130 


AL023553 


Homo sapiens


dJ347H13.4 (novel protein)


557


100


1131


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6) . 


1408


100


1132


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein


596


1 


1133 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein


389


35


1134 


AF180681 


Homo sapiens


guanine nucleotide exchange 
factor 


3597 


100


1135


AF079765 


Mus musculus


enhancer of polycomb


264


41


1136 


M62419 


Mus musculus


clathrin-associated protein 


2189


99


1137


AJ006219 


Drosophila 
melanogaster


clathrin-associated protein


1254


78


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440


98 


1139 


W88104 


Homo
sapiens


A Rab protein designated
HRABS-2.


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PR0339. 


3979


98


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein-Zap70
fusion product.


3309 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956.


660 


99 


1144


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide.


750


98


1145


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide.


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1147


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629.


370 


98 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492


100


1150


W74841


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228


55 











HEAAR60. 






1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma1


1855 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8117.


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase 


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens


Human signal peptide
containing protein HSPP-107
SEQ ID NO: 107.


746 


83 


1163 


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


945 


76 


1167


AF187733 


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase 


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces
cerevisiae


putative 


180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein)


1285 


100 


1175 


U41278 


Caenorhabditis
elegans


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J-alpha
region


284 


83 


1177 


AC012680


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo
sapiens]
>R94974
R94974 09-
MAY-1996 27-
OCT-1994
Human TCL-1]
polypeptide.


T cell leukemia/lymphoma 1


617 


100 






SEQ 
ID 
NO: 



1183 



1185 
1186 



1187 



1188 



1189 



1190 
1191 



1192 



1193 



1194 



1196 



1197 



1198 



1199 



1200 



1201 
1202 

1203



1204 



1205 



1206 



1207 



1208 



1209 



ACCESSION 
NUMBER 



U42841 



1210 



SPECIES 



Homo
sapiens 



Caenorhabditis
elegans



AJ131613 



1211 



1212 



1213 



1214 



L27645



Homo sapiens 



Danio rerio



Y02738 



Homo sapiens 



1215 



AF217544 



1216 



Xenopus 
laevis 



AL136307 | Homo sapiens



X89602 



U32828 



Homo sapiens
Haemophilus 
influenzae 
Rd 



AF154831 



Rattus 
norvegicus 



Y50926 



Homo sapiens 



AF026530 



U35244 



Rattus 
norvegicus 
Rattus 
norvegicus 



DESCRIPTION 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



short region of weak 

similarity to collagen 

dicarboxylate carrier protein 



161 



1470 



growth-associated protein



130 



Human secreted protein 
encoded by gene 89 clone 

HLHFP03.

ornithine decarboxylase-2



636 



1459 



dJ380B8.2 (Neuritin, a
protein which promotes
neurite outgrowth)
rTSbeta



182 



197 



ribosomal protein S6 
modification protein (rimK)



268 



PV-1 



1403 



Human fetal brain cDNA clone 
vcl6_l derived protein. 



918 



stathmin-like-protein splice

variant RB3'

vacuolar protein sorting 
homolog r-vps33a 



1093 



2981 



Y70470 



Homo sapiens 



Human p53 target molecule, 
PRG3 protein. 



1680 



AF157318 | Homo sapiens | AD-017 protein



912 



AF125443 



Caenorhabditis | contains similarity to S.
elegans | pombe phosphatidyl synthase

(GB:Z28295) 



460 



AF201934 
AL031775 



Homo 
Homo 



sapiens 
sapiens 



M21103 
Z85986 

U18762 



Ovis 
Homo 



aries 
sapiens 



Rattus 
norvegicus 



DC12 

dJ30M3.3 (novel protein
similar to C. elegans 

Y63D3A.4) 

BIIIB4 high-sulfur keratin

dJ108K11.3 (similar to yeast
suppressor protein SRP40)
retinol dehydrogenase type I 



484 
1143 



890 



U35730 



Mus musculus | jerky



2235 



AB002327 | Homo sapiens | KIAA0329 



151 



AB019233 



Arabidopsis | ubiquinone/menaquinone

thaliana | biosynthesis

methyltransferase-like

dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 



762 



AL136307 | Homo sapiens



742 



AF207989 | Homo sapiens | orphan G-protein coupled

receptor 



2326 



Z97630 



Homo sapiens 



dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 



181 



U21549 



Mus musculus | Ac39/physophilin



1280 



Y27700 



Homo sapiens 



1267 



AF117814 | Mus musculus
AF277233 | Naegleria



Human secreted protein 

encoded by gene No. 12.

odd-skipped related 1 protein | 945



calcineurin B 



222 



fowleri



D14849 



Mus musculus 



G03022 



Homo sapiens 



meiosis-specific nuclear

structural protein 1

Human secreted protein, SEQ
ID NO: 7103. 



1950 



590 



Z72510 



Caenorhabdit | similarity to yeast UTR3



634 



33 



99 
36 



100 



60 



33 



100 



31 



60 



100 



31 



96 



100 



47 



39 



82 



75 



52 



T6 



24 



56 



100 



100 



44 



68 



100



66



39 



77 



100 



49






is elegans 


protein {Swiss Prot accession 
yk677hll.5 comes from this 
gene 






1217 


Z49703 


Saccharomyces
cerevisiae


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabditis
elegans


similar to vanadate
resistance protein
transmembranous comes from
this gene 


965 


58 


1221 


AL163815


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence.


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226


X64002


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


2846 


100 


1228 


AJ005620 


Mus musculus


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


L08239 


Homo sapiens 


located at OATL1


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabditis
elegans


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae
probable membrane protein
YLR418C (GB:U20162)


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa)


574 


48 


1244 


AF220186


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1 


1139 


100 


1250


Y13148


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46


1252 


AF146738


Rattus


testis specific protein 


771 


83 






Homo sapiens


Human secreted protein, SEQ 
ID NO: 6806. 


419 


97 


1254




Homo sapiens


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 


99 


1255 


AC006538


Homo sapiens


BC41195 1 


831 


78 


1256 






mitochondrial methionyl-tRNA
transformylase


1556 


88 


1257 


43SU94


Homo sapiens




1354 


97


1258


Y13362


Homo sapiens 


Amino acid sequence of 
protein PRO214.


2383


100


1259 


AC006014 


Homo sapiens


similar to EFP transforming
protein; similar to P14373 


1299 


100 


1260 


AC005099


Homo sapiens


match to AI222572
(NID:g3804775)


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443


Rattus sp.


gamma-glutamyltranspeptidase
(AA 1-568) 


697 


32 


1263 


AF173871


Mus musculus


neuronal PAS 3 


977 


94


1264 


AF178983


Homo sapiens


Ras-associated protein Rap1


433


97


1265


Y70473 


Homo sapiens 


Human cyclic nucleotide-
associated protein-1 (CNAP-1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.


1622 


100 


1267


AF061346


Mus musculus


Edol protein


1077 


64 


1268


U97006


Caenorhabditis
elegans


C13F10.4 gene product


154 


23 


1269


AF233582 


Mus musculus 


GTPase Rab37 


942 


95


1270


AF195951 


Homo sapiens 


signal recognition particle
68 


3127 


98 


1271


AL031177


Homo sapiens


dJ889M15.3 (novel protein)


1150 


55 


1272 


AF201933


Homo sapiens


DC11 


650 


100


1273


AF201933


Homo sapiens


DC11


346 


98 


1274 


AL021710 


Arabidopsis 
thaliana 




348 


49


1275


AC004449


Homo sapiens


1 In. J J U U-J J 


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein
HL2AGB7, SEQ ID NO: 210.


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9) . 


1576 


99 


1278


S94421


Homo sapiens


T cell receptor eta-exon 


478 


100 


1279


Y66695


Homo
sapiens


Membrane-bound protein
PRO1344.


1909 


100 


1280


AF161380


Homo sapiens


HSPC262


772 


100 


1281


Y48610 


Homo sapiens 


Human breast tumour-
associated protein 71.


779 


100 


1282


AC015446


Arabidopsis
thaliana


Similar to AIG1 protein


406 


35 




AK024432 


Homo sapiens 


FLJ00022 protein 


403 


35 


1284 


W96153 


Homo sapiens 


Human FADD-interacting
protein (FIP).


1825 


81 


1285


AJ001019


Homo sapiens


ring finger protein


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


rnm7fl gene product


195 


29 


1287 


AF178632 


Homo sapiens 


FEM-1-like death receptor
binding protein


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
I38027 (PID:g2135214)


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
I38027 (PID:g2135214)


668 


93 


1290


AB023811


Homo sapiens


TU3A


351 


54


1291 


Z73424 


Caenorhabditis
elegans


C44B9.1 


235


36


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF180425


Homo sapiens 


retinoblastoma-associated 
protein RAP140


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 


99 


1295 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia 
coli 


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor


1206 


100 


1301


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303


X52904


Escherichia
coli 


open reading frame (AA 1-65) 


359 


100 


1304 


U19577 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307 


U58750 


Caenorhabditis
elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ21GBl.l (KIAA0680) 


267 


34 


1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabditis
elegans


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein
sequence.


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo 
sapiens] 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25 


467 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 


pol 


304 


64






retrovirus 








1324 


AL138655 


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein




35 


1326 


AL133215 


Homo sapiens 


JjAIvJoXj / t & \ novel piutCJ.il 

similar to rat tricarboxylate 
carrier)


1322


99


1327 


AF161541 


Homo sapiens 


HSPCOdd [ 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein
sequence.


785


96 


1329 


L10910 


Homo sapiens


splicing factor


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331


W87772 


Homo sapiens 


Human serum glucocorticoid-
regulated Kinase (H-SGK)
polypeptide.


232 


39 


1332 


Y41741 


Homo 
sapiens 


Human PRO704 protein sequence.


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc -finger protein zbkiu. i 


ATT 
411 




1334 


Z82271 


Caenorhabditis elegans


Similarity to Mouse kinesin-like protein KIF4 comes from this gene


578 


44 


1335 


AE000810 


Methanobacterium thermoautotrophicum


conserved protein 






1336 


Y68779 


Homo sapiens 


Amino acid sequence of a human phosphorylation effector PHSP-11.




91 



1337 


AB027003 


Mus musculus


protein phosphatase


378


84 


1338 


U64856 


Caenorhabditis elegans


weak similar my tio itK 
domains 


^ X J 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein or tne imk/ ramny 




29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; ba^ya- 
67881 


9 ft 9 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


1 Ol 


t nn 

1UU 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 

(PID:g4650844; 


f a □ /i 

j o y ^ 




1345 


AF257466 


Homo sapiens 


N-acetylneuraminic acid phosphate synthase


1 lOOu 


99 


1346 


Y25696 


Homo sapiens 


Human secreted protein fragment encoded from gene 64.


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like protein


1664


58 


1348 


AF161548 


Homo sapiens 


HSPCOo J 


1018


98 


1349 


W78128 


Homo sapiens 


Human secreted protein encoded by gene j clone HOSBI96.


1117


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6225.


418 


100 


1352 


D90869 


Escherichia 
coli


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS


870


74 


1355 


AC024876 


Caenorhabditis elegans


contains similarity to SW:RPB1_CRIGR




61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response element binding and beta transducin family proteins


5035


99






1362 


Z48475


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene transcript 1 protein; MEGT1 protein


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein similar to C. elegans T19B10.6 (Tr:Q22557))


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel K+Hnov15.


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -25 to 1018) (3416 is 2nd base in codon)


5908 


99 


1372 


Z98048 


Homo sapiens 



dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445 


Homo sapiens 


DOC1


1645 


46 


1376 


AL117337 


Homo 
sapiens 


DA393J16.1 (zinc finger protein 33a (KOX 31))


250 


60 


1377 


AC005328 


Homo sapiens 


R26660_1, partial CDS


1126 


100


1378 


U35113 


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379 


L15313 


Caenorhabditis elegans


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676


Mus musculus 


G beta-like protein GBL


1721 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1. 


715 


100 


1386 


AF212162 


Homo sapiens


ninein 


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 


AC004809


Arabidopsis 
thaliana 


Similar to 



130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour-associated protein 72.


817 


99 


1401 


AC004472


Homo sapiens 


P1.11659_5


280 


54 


1402 


X91489 


Saccharomyces cerevisiae


putative HMG box 


164 


27 



1404



1405 



1406 



1408 



1409 



1412 



1413 



1414 



1415 
1416 



1417 
1418 



1419 



1420 



1421 



1422 



1424 



1425 



1426 



1427 



1429 



1430 
1431 



ACCESSION 
NUMBER 

Y79222 



1432 



X81058 



AB012084 



1433 



AB030251 



1407 | AJ010585



X75760 



U76618 



1410 | AC005578



1411 | AE000284



1436 



1438 



1439 



X01563 



W78279 



Homo 
sapiens 



Homo sapiens 



Rattus rattus



DESCRIPTION 



Human transferase TRNSFS-14.



SMITH- 
WATERMAN 

SCORE 
2842 



IDENTITY 



tex261 



3233 
2684 



Drosophila 
melanogaster 



LRR47



804 



835 



coli 



Escherichia coli | L5 (rplE) (aa 1-179)



AB031051 



Homo sapiens | Fragment of human secreted

protein encoded by gene 33 
Homo sapiens | organic anion transporter 

OATP-E 



3832 



M17466 
AF097994 



3455 



L-kynurenine/alpha-aminoadipate aminotransferase



2202 



AF151077 
Y09945 



Homo sapiens 
Rattus 
norvegicus 



HSPC243 

putative integral membrane 
transport protein 



1262 
1098 



U13152 



Mesocricetus auratus | guanine nucleotide-binding protein beta 5

AL162458 Homo sapiens 



2179 



DA465L10.5 (KIAA1176 (novel protein, presumed ortholog of mouse K-Cl cotransporter KCC2))



5696 



Y99426 



Homo sapiens 



Y94923 



1423 | AF177388



Homo sapiens 



Homo 
sapiens 



Human PRO1604 (UNQ785) amino acid sequence SEQ ID NO: 308.
Human secreted protein clone qs14_3 protein sequence SEQ ID NO: 52.
cancer-amplified transcriptional coactivator

ASC-2 



152 



4039 



10748 



Y48517



Human breast tumour-associated protein 62



1454 



Homo 



853 



AF112886 



Bos taurus 



4693 



1428 | U41387



1372 



AF161534 



2853 



AF125043
Y66718 



Homo 
sapiens 



AF193613 



AB044560 



Homo sapiens 
Mus musculus



1434 | R99800



Homo sapiens 



1435 | AF220530 | Homo sapiens



X70944 



Homo sapiens 



1437 | AF271732 | Homo sapiens



Y30811



Homo sapiens 



AJ293659 



1440 | AF219138 | Homo sapiens



1441 | AF219138 | Homo sapiens



bisphosphate 3'-nucleotidase
Membrane-bound protein 
PRO1106. 



cell recognition molecule 

Caspr2 

Gliacolin 



192 



NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 



myo-inositol 1-phosphate synthase A1



2904 



bridging integrator-3



1282 



595 



628 



GGA3 long isoform 



100 



99 



48 



100 



100 



99 



61 



76 



100 



29 



99 



89 



79 



78 
TO 



100 



34 



72 



100 





1442 


AB039669 


Homo sapiens 


ALEX3 


1944 


100 


1443 


AF237711 


Drosophila 
melanogaster 


Diablo 



191 


27 


1444 


AJ011896 


Homo sapiens 


Naf1 beta protein


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


98 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC_2H01 


2645 


99 


1448 


AF003136 


Caenorhabditis elegans


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein vc54_1, SEQ ID NO: 48.


985 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


Z38011


Mus musculus 


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE 


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 
gallus 


retinovin 


945 


34


1460 


U11036 


Homo sapiens 


Ibd1


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like phosphodiesterase


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs Z43979 (NID:g573097), R19699 (NID:g774333)


869 



98 



1464 


AC004997 


Homo sapiens 


match to ESTs Z43979 (NID:g573097), R19699 (NID:g774333)


869 


98 


1465 


U32743 


Haemophilus influenzae Rd


fucose operon protein (fucU) 


315 


50 


1466


Y09022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney-specific (KS) gene


1072 


99 


1468 


AF071544 


Spinacia 
oleracea 



ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit N-methyltransferase 1


333 


26 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54. 


1053 


100 


1470 


AF032666 


Rattus 
norvegicus 


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel protein-17 (MECHP-17).


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1


1101 


50 


1475 


Y86241 


Homo sapiens 


Human secreted protein HOABR60, SEQ ID NO: 156.


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis elegans


coded for by C. elegans cDNA yk99b4.3; similar to human transforming protein (PIR:S22157)


846 


44 


1478 


X62447 


Homo sapiens 


PR264


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


U10536 


Pan paniscus 


MHC class I A


675 


84 



1481



1482 



1483 



1484 



1485 



1486 



1489 



1490 



1493 



1494 



1495 



1496 



1497 



1498 



1499 



1500 



1501 

1502 
1503 



1504 



1505 



1506 



1507 
1508 



1511 



ACCESSION 
NUMBER 



AL078599 



Z98977 



AB005662 



AL050120 



M27878 



Y69161 



X84156 



AF038963 



U56966 



AE000989 



M80633 



Y73342 



Y17220 



AF133670 



Y94897 



AL049699 



AF037447 



AL445067 



SPECIES 



DESCRIPTION 



Homo sapiens | dJ991C6.1 (novel protein 

similar to C. elegans 
F55A12.9 (Tr:P91086))



Schizosaccharomyces pombe | putative vacuolar protein

Mus musculus | JNK/SAPK-associated protein-1



Homo sapiens | hypothetical protein 



Homo sapiens | Amino acid sequence of a 

partial protein kinase. 



Saccharomyces cerevisiae
Homo sapiens 



RNA helicase 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



256 



716 



1006 



341 



446 



Caenorhabditis elegans | coded for by C. elegans cDNA yk30b3.5; coded for by C. elegans cDNA yk30b3.3



Archaeoglobus fulgidus | enoyl-CoA hydratase (fad-4)



Rattus 
norvegicus 
Homo sapiens 



HTRM clone 2709055 protein 
sequence 



Homo sapiens 



Human secreted protein (clone fj283-11).



Mus musculus | ARL-6 



Homo 
sapiens 



Human protein clone HP10574. 



Homo sapiens 



Homo sapiens 



dJ747H23.2 (novel protein)
ribosomal S6 protein kinase 



AB039947 



AJ277750 



AL050333 

AF179896 
AF178948 



Y53005 



X82494 



X98296 



AL034548
Y76144 



AF220182 
U64601 



AL356192 



Thermoplasma 
acidophilum 



Homo sapiens 



putative target YPL207w of 
the HAP2 transcriptional 
complex related protein
X11L-binding protein 51



Homo sapiens 



Homo 
sapiens 
Homo sapiens 
Homo sapiens 



UBASH3A protein 

dJ93K22.1 (novel protein 

(contains DKFZP564B116) ) 

TALE homeobox protein Meis2
TALE homeobox protein Meis2a 



Homo sapiens 



Homo sapiens 



Human secreted protein clone 
pm749_8 protein sequence SEQ 

ID NO:16. 

fibulin-2



Homo sapiens 



Homo sapiens 
Homo sapiens 



dJ1103G7.6 (novel protein)
Human secreted protein 
encoded by gene 21. 



620 



533 



3513 



462 



701 



1371 



1550 



2427 



269 



227 



3509 



2439 



1140 

Ti7T 

1442 



3580 



783 



Homo sapiens 

Caenorhabdit 
is elegans 



uncharacterized hypothalamus 

protein HT008 

Gene probably begins in the 
next cosmid



Neurospora 
crassa 



related to MDM1 protein



1098 
1736



415 



196 



65 



29 



92 



100 



99 



29 



34 



42 



46 



95 



99 



37 



97 



100 



100 



100 



35 



36 



100 



100 



99 



99 



42 



58 



1512 



1513 



1514 



1515 



1516 



D17629 



Homo 
sapiens 



N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 



AF168717 



AJ243531 



Homo sapiens 



AC003672 



Arabidopsis thaliana



AF115435 



Rattus 
norvegicus 



putative C3HC4-type RING zinc 
finger protein 



407 



syntaxin 17 



30 



1517 



1518 



1519 



AF003140 



Caenorhabdit 
is elegans 



C44E4.5 gene product 



AB002584 



Rattus 
norvegicus 



beta-alanine-pyruvate aminotransferase



AL121764 



Schizosaccharomyces pombe



yeast atp12 protein precursor homolog



270






1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein PRO190.


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis thaliana


F17F8.22 



277 


37 


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like phosphodiesterase


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


611 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein encoded by gene 77 clone HOEAS24.


493 


57 



1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 


99 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


374 


37 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone vb3_1 derived protein.


3693


99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539


AF266756 


Homo sapiens 


sphingosine kinase 


2011 


99 


1540


Z48804 


Homo sapiens 


OA1 


2238 


100 


1541 


AF000195 


Caenorhabditis elegans


Contains similarity to Pfam domain: PF00169 (PH), Score=20.6, E-value=1.9e-05, N=1


379 


42 


1542 


Y71159 


Homo sapiens 



Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HRIHFB2007 


631 


50 


1545 


AF198487 


Homo sapiens 


transcription factor LBP-1b


2822 


100 


1546 


AF016417 


Caenorhabditis elegans


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 


AB035495 


Carassius 
auratus 


ubiquitin-activating enzyme E1


836 


42 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668)


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


292 


42 


1551 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


822 


44 


1552 


AL157734 


Schizosaccharomyces pombe


putative mannosyltransferase involved in N-glycosylation


435 


37 


1553 


AF079527 


Mus musculus


IER5 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, ISMO-3.


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain dehydrogenase/reductase


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport 


1975 


99 



1558



1562 
1563 



1564 



1566 



1568 



1569 



1570 
1571 



1574 



1576 



1577 



1578 



1579 



1580 



1581 



1582 



1583 



ACCESSION 
NUMBER 



Y71056 



1559 | Y71056



1560 | AF092050
1561 | AL109827



1584 



1585 



1586 



1587 



1588 



1589 



1590 



1591 



AJ131890
AL035424



SPECIES 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



protein, MTRP-1.



Homo sapiens | Human membrane transport

protein, MTRP-1. 



1975 



1894 



Mus musculus | acetylglucosaminyltransferase
Homo sapiens | dJ309K20.2 (acrosomal protein ACR55 (similar to rat sperm antigen 4 (SPAG4)))
Homo sapiens | DNA polymerase lambda 



1607 



Homo sapiens 



1592 



1593 



22D12.1 (novel protein similar to Drosophila Kelch proteins)

AC002400 Homo sapiens Gene product with similarity 

to Ubiquitin binding enzyme 



3002 
3015 



2790 



550 



1594 



is elegans 



1567 | AB033281



Homo 
sapiens 



domain: PF00169 (PH), Score=20.6, E-value=1.9e-05, N=1

F-box and WD-repeats protein beta-TRCP2 isoform C



2879 



D49473 



AK025270 

X75756 

AF145713 



210 



1572 | AE003831



Homo sapiens

Drosophila 

melanogaster 



180 



1573 | AF074603 



NonF 



U28993 



Streptomyces griseus subsp. griseus

Caenorhabdit F22D3.3 gene product 



1575 | AF129507



287 



X64878 
AF237711 



G00975 



AF248744



AL121782



Homo sapiens | oxytocin receptor
Drosophila melanogaster | Diablo
Homo sapiens | Human secreted protein, SEQ ID NO: 5056.
Cryptosporidium parvum | thrombospondin-related adhesive protein
Homo sapiens | dJ585I14.2 (novel protein



2002 
"42T 



480 



123 



(translation of cDNA Em:AK000219))



AF041853 



345 



AF025441 



AE001803 



Thermotoga maritima



349 



AF252283 | Homo sapiens | Kelch-like 1 protein



3973



AF169675 



Homo 
sapiens 



AF118274 | Homo sapiens



leucine-rich repeat transmembrane protein FLRT1

DNb-5 ' 



3494 



2628 



X79440 



X99802 



AF169803 



Y29861 



Z25535 



Homo sapiens 



X13293 



Homo sapiens 



M74027 



AL139314 



Homo sapiens 
Schizosaccharomyces



nuclear pore complex protein 

hnup153

B-myb protein (AA 1-700) 



mucin 

hypothetical protein 



2563 



181 



7567 



242 



97 



97 



100 



100 



100 



82 



45 



100 



91 



31 



68 



100 



33 



100 



34 



100 



97 



99 



99 



100 



47 



99 



99 



27






pombe 








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone rb649_3 protein sequence SEQ ID NO: 18.


2236 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1408 


99 


1598 


AB032254 


Homo sapiens


bromodomain adjacent to zinc finger domain 2A


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf50


2305 


100 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821 


99 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase


1601 


99 


1605 


AF185576 


Mus musculus


POZ/zinc finger transcription factor ODA-8


3435 


97 


1606


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1 - 728) 


3765 


100 


1611 


Y03200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99


1613 


AC004481 


Arabidopsis thaliana


nodulin-like protein


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic apoptosis related protein-1 (CIPAR-1)


890 


62 



1617 


X58079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66678 


Homo 
sapiens 


Membrane-bound protein PRO1009.


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 



100 


1620


AF150733 


Homo sapiens 


AD-014 protein 


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus fulgidus


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013


Schizosaccharomyces pombe


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted protein sequence, SEQ ID NO. 203.


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 


1629 


AF132484 


Mus musculus


unknown 


286 


68 


1630 


AF017096 


Drosophila 
melanogaster 


similar to C. elegans R10H10.6 and S. cerevisiae YD8419.03c


493 


61 


1631 


X03077 



Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38


1635 


AF026246


Homo sapiens 


HERV-n integrase


411 



90 


1636 


Y50943


Homo sapiens 


Human fetal brain cDNA clone ve8_1 derived protein.


1125


95 


1637 


AF134593


Homo sapiens 


L-pipecolic acid oxidase


2068 



99 


1638 


AJ238247 


Mus musculus


putative phosphatase subunit 


1948


96 


1639


Y94942 


Homo sapiens 


Human secreted protein clone yk251_1 protein sequence SEQ ID NO: 90.


j. j \j ] 


100 



1640 


AF235030


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288


Drosophila 
melanogaster 


WDS 


"3 CQ I 
JDO 3 


^ o 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452


Homo sapiens 


Human membrane channel protein-2 (MECHP-2).




100


1644 


AF176520


Mus musculus 


WD repeat-containing F-box protein FBW5


2676


88


1645 


W67816 


Homo sapiens 


Human secreted protein encoded by gene 10 clone HCEMU42.




100


1646 


X67155 


Homo sapiens


mitotic kinase-like protein-1


a a c; c I 

ft ft 3 O | 




1647 


M63180


Homo sapiens


threonyl-tRNA synthetase


1040


61 


1648


Y87342


Homo sapiens


Human signal peptide containing protein HSPP-119 SEQ ID NO: 119.


1 ecc i 


7J 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor receptor 1 death domain ligand (clone 3TW).


*t X J » 


100 



1650 


AC007136 


Homo sapiens 


Putative map kinase interacting kinase


ODD j 




1651 


AB015346


Homo sapiens


Eps15R


4464 


99 


1652 



AL161576


Arabidopsis thaliana


putative protein 


JL J 4 X 


48 


1653 


AC005313 


Arabidopsis thaliana


putative calmodulin 


«oo 


28 


1654 


AL031428 


Homo sapiens


dJ184J9.1 (KIAA0601 protein)


3526


100


1655 


AL031428 


Homo sapiens


dJ184J9.1 (KIAA0601 protein)


J J^U 


100


1656 


AB017910 


Dictyostelium discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 




1 77 


1658 


AF056191 


Homo sapiens


TPA inducible protein




1 Qfl 


1659 


U76846 


Arabidopsis thaliana


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccharomyces pombe


actin-like protein; (2 actin 
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 chain


IDZ / *4 


99 


1663 


AF300648 


Homo sapiens


guanine nucleotide binding protein beta subunit 4


i 1 PI 1 


100


1664 


AF214736 


Homo sapiens 


EH domain containing protein 2


2774 


100 


1665 


Z48613 


Saccharomyces cerevisiae


unknown


138 


26 


1666 


AF177385 


Homo sapiens


cytochrome c oxidase assembly protein isoform 2


1395 


99 


1667 


AC007842 


Homo sapiens


BC331191_1


1581


47


1668 


S67513 



Borna disease virus BDV, WT-1, Halle B1/91, horse brain, field isolate, Peptide, 370 aa


p40


397


43








1669 


Z99753 


Schizosaccharomyces pombe


putative NOL1-NOP2-sun family nucleolar protein


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005 


99 


1673 


Y51846


Homo sapiens 


Human 18.1 homolog protein fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


152 


29 


1675 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein encoded from gene 2.


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein encoded from gene 2.


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349 


100 


1681 


AF019236


Dictyostelium discoideum


TipD 


613 


34 


1682 


AJ243459 


Leishmania 
major 


proteophosphoglycan 


153 


26 


1683 


Z69369 


Schizosaccharomyces pombe


putative GTP-binding protein 


560 


46 


1684 


X94910 


Homo sapiens 


ERp28 


1334 


100 


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase regulator-like protein


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2958 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


88 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 



138 



43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein NUDE1


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


katanin p60 


1664 


66 


1694 


AF263539 


Homo sapiens 


arginine N-methyltransferase


1774 


100 


1695 


AF222689 


Homo 
sapiens 


protein arginine N-methyltransferase 1-variant 2


1182 


81 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing NADPH oxidase


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing NADPH oxidase


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488 


54 


1700 


Y44676


Homo sapiens 


Human ARF-Related Protein-1 (HARP-1).


938 


97 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF055078 


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus 


RP42 


1057 


77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AL391710 


Arabidopsis thaliana


putative protein


505


50








1710


B01311


Homo sapiens


Human PRO241 polypeptide.


1049




1711


U40750


Mus musculus


formin binding protein 30


4561


85


1712


AJ011118 


Mus musculus 


skeletal muscle and cardiac
protein 1 


1490 


89 


1713


AF255303


Homo
sapiens


membrane-associated nucleic
acid binding protein


4416


99 


1714


AF255303


Homo
sapiens


membrane-associated nucleic
acid binding protein


2960


100 


1715 


U08227


Rattus
norvegicus


Ras-related protein


511


51 


1716


AF168795


Rattus
norvegicus


schlafen-4


1129


44


1717 


AF196304


Homo sapiens


SUMO-1-specific protease


5804


99 


1718


AL355737


Homo sapiens


HMG20A


1782


100 


1719 


AB029333


Halocynthia
roretzi


HrPET-1


1069 


46 


1720


AF071317


Mus musculus


COP9 complex subunit 7b


1297


97 


1721


AJ272215


Homo sapiens


HEYL protein


1681


99 


1722


G01982


Homo sapiens


Human secreted protein, SEQ
ID NO: 6063.


718


100


1723


AL032643


Caenorhabditis
elegans


similar to Uncharacterized
protein family UPF0034,


825


41


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6053.


586 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


AF255443 


Homo sapiens


CGI-201 protein 


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99 


1728 


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529 


Gallus 
gallus 


tensin 


1411 


84 


1730 


Z73423 


Caenorhabditis
elegans


cDNA EST EMBL:Z14908 comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891


Homo sapiens


PRO0105


470


30 


1733 


AJ277724


Homo sapiens


histone deacetylase 8


2015


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8131.


503 


95 


1735 


D45913


Mus musculus


leucine-rich-repeat protein


3531


94


1736 


AF096709 


Drosophila
virilis


failed axon connections
protein


276 


32 


1737 


AF195120


Homo sapiens


dynactin p62 subunit


2417


99 


1738 


L15314 


Caenorhabditis
elegans


contains similarity to Pfam
family PF01772 N=1


206 


37 


1739 


X54618 


Listeria
monocytogenes


phosphatidylinositol specific
phospholipase C


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins)


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
173.


1013 


99 


1742 


AC013354 


Arabidopsis
thaliana


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein
APD08.


1932 


59 


1744 


W75771 


Homo
sapiens


Human GTP binding protein
APD08.


1854 


61 


1745 


AF221098 


Homo
sapiens


Ral guanine nucleotide
exchange factor RalGPS1A


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino
acid sequence SEQ ID NO: 116.


1332 


99 


1747 


Y94294


Homo sapiens


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100






1748 


AK024436


Homo sapiens 


FLJ00026 protein 


1619 


100 


1749 


AE000877 


Methanobacterium
thermoautotrophicum


conserved protein 


231 


36 


1750 


AF101361


Drosophila 
melanogaster 


Abnormal X segregation 


193 


33 


1751 


Y15067 


Homo sapiens 


ZNF232


889 


100 


1752 


AF251038 


Homo sapiens 


GAP-like protein


822 


100 


1753 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059 
(PID:gl29308) 


352 


57 


1754 


X69089


Homo sapiens 


165kD protein 


5703 


99 


1755 


AL049795 


Homo sapiens 


dJ622L5.3 (novel protein) 


1039 


100 


1756 


AL031393 


Homo sapiens 


dJ733D15.1 (Zinc-finger 
protein) 


2765 


100 


1757 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


2020 


99 


1758 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


776 


43 


1759 


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760 


Y12065 


Homo sapiens 


hNop56 


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 


AC002394 


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 


AB013365 


Bacillus 
halodurans 


YlqF 


350 


34 


1766 


Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 


AC009176 


Arabidopsis
thaliana


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


216 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


U89435 


Mus musculus 


unknown 


829 


86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier 


1604 


95 


1773


AL035086 


Homo sapiens 


dJ44A20.2 (novel protein) 


2036 


100 


1774


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


1057 


99 


1775 


AF110330 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269529 


Homo sapiens 


glycerol 3-phosphate permease


2787 


100 


1777 


Z81579 


Caenorhabditis
elegans


cDNA EST yk76fl.5 comes from 
this gene 


232 


31 


1778 


AY007239 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccharomyces
pombe


oxysterol-binding protein
family 


644 


38 


1780 


AF254260


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49 



1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014.


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta-
glucuronidase exon 11 homolog


247


100







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TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e-
12 157-181 


3 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 8.085e-
13 358-381


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e- 

10 1129-1146 BL00028 
Iff 07 1 9 , ?7f3-0Q P5n. 

X O * KJ 1 X.<£^/G \S27 o«U 

837 


5 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


6 


BL00023 


Type II fibronectin 
collagen-binding domain 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


7 


BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


8 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


9 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PRnfl4fi4f3 

12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


fJJvUUOO 1j >7Z g ,ZUUc 

14 282-295 PD00066
13.92 9.400e-14 477-
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP
reductase proteins.


BL00487E 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333


25 


BL00115


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins.


BL00115T 8.45 7.273e-
29 1208-1242 BL00115Q
18.08 2.776e-21 953-
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 9.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26 


BL00420 


Speract receptor repeat
proteins domain
proteins.


BL00420A 20.42 4.109e-
11 81-110 BL00420A
20.42 8.820e-10 84-113


27 


BL00050 


Ribosomal protein L23 
proteins.


BL00050A 23.71 9.250e-
27 94-127 BL00050B
14.81 8.125e-12 133-
147


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e-
10 41-54


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e-
09 486-516


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.76 5.065e-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130-
169


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66
3.786e-27 224-247
PR00629A 13.45 8.364e-
21 206-222 PR00629C
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e-
11 276-286


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


38


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380 



KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e- 
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223


45 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins.


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196-
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217 
PR00762F 15.12 3.100e-
16 563-583 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07 
2.2ebe-lo b4b-Db^ 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 b.oUUe- 
10 153-203 


58 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.75 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 /.jyoe- 
09 680-693 J 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 / . jSDe- 
09 670-683 1 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 a. /I4e- 

10 51-64 j 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 

09 lQo-llo j 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 J 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins . 


BL00790N 13.25 6.116e- 
10 93-120 [ 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B | 
8.45 4.857e-12 70-81 j 


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE . 


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393-
410


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7 . 19 /e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e-
09 985-1004


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e-
09 246-269


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 o.2Uue- 
09 264-279 PR00320B 
12.19 o .DjUc-u? 
279 j 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 ^.58Be- 

i/i no 1 
14 iJLb-JJA J 


95 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 4.000e-
10 123-154


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- ) 
13 134-146 PR00081A j 
i n Kl 9 c 00e-12 54-72 j 


98 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D j 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e- 
16 529-547 PR00380C j 
13.18 2.756e-13 560- 
579 1 


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e-
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e- 
12 181-197 


106


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191K 17.38 4.951e- 
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins . 


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 8.560e- 
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR
SIGNATURE


PR00529C 11.03 7.506e-
10 158-177 


120 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.9?2e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e-
09 119-133 


133 


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE


PR00215C 13.98 6.779e- 
10 475-496 


136 


BL01310


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35 
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141


BL00501


Signal peptidases I
serine proteins.


BL00501D 16.69 9.538e-
14 113-133 BL00501C
9.61 8.688e-10 89-101


BL01020


SAR1 family proteins


BL01020C 15.35 7.722e-
20 79-130


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.400e-
25 335-374


149


BL00126


3',5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479


151


BL00632


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-
20 106-149


BL00559


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e-
19 29-58 BL00559K
13.17 2.957e-18 172-
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259


155


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.692e-
13 13-35


157


BL00406


Actins proteins.


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C
6.75 9.682e-12 128-183


160


BL00132


Zinc carboxypeptidases,
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C
21.35 3.466e-12 104-
145


165


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 9.043e-
13 139-158


168


BL00362


Ribosomal protein S15
proteins


BL00362 24.67 9.700e-
15 129-172


169


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19
4.553e-13 378-404
BL00039C 15.63 8.773e-
12 465-489


175


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 3.721e-
12 14-36


178


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 133-169


179


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.4
36 6-45




180 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.429e- 
20 160-180 PR00007A
19.33 4.938e-19 133- 
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191


PR00450


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564 


Octicosapeptide repeat 
proteins. 


PF00564B 24.74 6.164e-
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE 



PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e-
19 65-87 PR00261C
11.37 9.308e-19 65-87
PR00261D 12.47 2.667e-
18 65-87 PR00261B
14.12 4.000e-18 143-
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165


209 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C
20.98 7.680e-10 132-
171


211 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007A 19.33 5.781e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60
1.675e-15 201-223
PR00007D 9.64 7.231e-
11 233-244


212 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


213 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.900e-
29 568-614 BL00039A
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 364-388 BL00039B
19.19 4.064e-11 277-
303


217 


BL00100 


Chloramphenicol
acetyltransferase
proteins.


BL00100D 17.22 8.484e-
09 68-106


219 


PR00213


MYELIN P0 PROTEIN
SIGNATURE


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155


224 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e-
19 18-39


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66


229 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e- 
13 329-346 PR00301G 
13.78 4.300e-12 361-
382


230 

1 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
1 1^0 j 


231 


PR00647


SENR ORPHAN RECEPTOR 
SIGNATURE 


09 273-287


233 


BL00292 


Cyclins proteins. 


BL00292B 20.31 7.429e- 
27 244-275 BL00292A
22.87 7.750e-27 201-
235 


234 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e-
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e-
11 109-123


235 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 245-259 PR00019B 
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and 
deoxycytidylate
deaminases zinc-binding 
region s. 


BL00903 12.93 8.941e- 
12 54-64 



245 


DM00179 


W KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e- 
09 124-134 


248 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 305- 
351 BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674 


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e- 
09 61-88 


256 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e- 
10 421-435 


258


PR00094


ADENYLATE KINASE
SIGNATURE


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262 


BL00388 


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
18.79 8.147e-16 126- 
148 


264 


BL00903 


Cytidine and 
deoxycytidylate
deaminases zinc-binding
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270


BL00226 


Intermediate filaments 
proteins. 


BL00226D 19.10 l.OOOe- 
37 362-409 BL00226B 
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SEQ ID NO: 



ACCESSION 
NO. 



271 



PD02952 



272 



PD02929 



274 



BL01027 



275 



PR00424 



277 



BL00052 



279 



BL00790 



280 



PR00319 



281 



PR00319 



287 



PF00929 



291 



BL00326 



292 



BL00326 



294 



PD0O066 



295 



BLOO028 



RESULTS* 

23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e~ 
15 96-111 



KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULT I GENE FAMI. 



16 235-265 PD02952B 
15.57 5.625e-09 215- 
229 



ADHESION GLYCOPROTEIN 
PRECURSOR I. 



PD02929A 28.27 l.OOOe- 
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 



Glycosyl hydrolases 
family 39 proteins. 



BL01027B 15.34 3.486e- 
09 213-250 



ADENOSINE RECEPTOR 
SIGNATURE 



PR00424D 14.32 
11 39-59 



Ribosomal protein S7 
proteins . 



Receptor tyrosine kinase 
class V proteins. 



BL00052A 27.85 6.000e- 
13 137-184 BL00052B 
15.17 5.143e-l2 208- 
235 



BL00790N 13.25 5.659e- 
13 267-294 



BETA G- PROTEIN 
(TRANS DUCIN) SIGNATURE 



PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 1.000e-2l 89-105 
PR00319A 15.27 8.3 64e- 
21 51-68 PR00319B 
11.47 8.200e-l9 70-85 



BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 



PR00319D 11.64 6.625e- 
23 94-112 PR00319C 
13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-l9 57-72 



Exonuclease . 



Tropomyosins proteins. 



BL00326A 14.01 2.360e 
09 93-127 



PROTEIN ZINC- FINGER 

METAL- BIND I . 

Zinc finger, C2H2 type, 
domain proteins. 



BL00326A 14.01 2.360e- 
09 93-127 



PD00066 13.92 8.714e- 

12 203-216 

BL00028 16 .-07 5-bUUe- 
15 322-339 BL00028 
16.07 9.471e-14 433- 
450 BL00028 16.07 
4.600e-13 648-665 
BL00028 16.07 5.500e- 

13 760-777 BL00028 
16.07 9.550e-13 788- 
805 BL00028 16.07 
3.348e-12 704-721 
BL00028 16.07 6.478e- 
12 461-478 BL00028 
16.07 8.435e-12 844- 
861 BL00028 16.07 
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 815-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581








BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 



Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase. 


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152 


tRNA synthetases class 
II. 


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 
28.03 9.250e-21 220-
257 PF00152B 15.67 
2.658e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308


PR00237 



RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e- 
13 188-212 PR00237G 
19.63 7.207e-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X 
proteins. 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310 



BL00326 


Tropomyosins proteins.


BL00326D 8.76 5.235e- 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e- 
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 


PR00109B 12.27 4.814e-
10 216-235 









SIGNATURE




321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372


322 


PR00109 


TYROSINE KINASE 

CATALYTIC DOMAIN

SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324


BL01241 


Link domain proteins.


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282- 
335 


326 


BL00412


Neuromodulin (GAP-43) 
proteins.


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins.


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.393e- 
18 27-49


333 


BL01016 


Glycoprotease family 
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F 
13.34 Jl.Dbje-uy rfjuu— 
212 BL01016B 8.93 

a QCCfl no Ifl — E?fl 
b.o5I>e— U7 JO~3v 


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


rUUXUOO 17.43 i .iiJic- 

33 10-49 


341


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.400e-
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e- 
11 135-154 


347


PR00109


TYROSINE KINASE


PR00109B 12.27 4.764e-






CATALYTIC DOMAIN 
SIGNATURE 


11 135-154 


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


354 


BL00380


Rhodanese proteins. 


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e- 
15 261-274 PD00066 
13.92 6.500e-13 233- 
246 PD00066 13.92 
4.300e-09 289-302


361 



PF00791 



Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363 



PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109-
131 


364 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e-
09 22-68 


365 


PF00242 


DNA polymerase (viral) 
N-terminal domain 
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e- 






10 88-118 


380 


BL00107 


Protein kinases ATP- 
binding region proteins.


23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381


BL00455 


Putative AMP-binding
domain proteins. 


uT.nn4^ \% 31 5 714e- 
12 50-66 


382


PR00624


HISTONE H5 SIGNATURE


09 524-544 


384


PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168-
181


385 


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e- 
10 97-130 


388


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


389 


BL00290


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.667e- 
09 151-174 


390


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.200e- 
15 221-246 BL00215A 
15.82 7.613e-14 ^u-4o 
BL00215A 15.82 8.851e- 

11 1^.J-1**0 131JU UZ1 JD 

10.44 9.526e-ll 69-82 
tit Ami ctj i n 44 7 n np^ — 

09 272-285 BL00215B
in 44 fl c;00ft-09 165- 

178 


394 


BL00674 


AAA-protein family
proteins. 


16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 8.579e-
11 141-155 


398 


PR00761 


BINDIN PRECURSOR
SIGNATURE


PR00761B 9.93 6.764e- 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins.


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 24.71 8.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-44b3 BjjuUbi*fctj 
15.98 6.092e-14 4555- 

2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


r if K) U j -J XD . □ / zi - j > 

09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90 
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins. 



BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246- 








294 BL00232B 32.79 
9.384e-15 463-511 
BL00232C 10.65 2.537e- 
12 244-262 BL00232C 
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 



PF00791 



Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49 
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 



DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205- 
244 BL00039B 19.19 
8.920e-16 251-277 
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),


PF01140D 15.54 9.663e- 




p15.


10 183-218 PF01140D
15.54 3.093e-09 246-
281 


449 


PR00568 


DOPAMINE D3 RECEPTOR
SIGNATURE


PR00568G 13.95 5.551e-
09 39-53


451 


PF00084


Sushi domain proteins
(SCR repeat proteins.


PF00084B 9.45 3.813e-
10 47-59


452 


BL00790


Receptor tyrosine kinase
class V proteins.


BL00790I 20.01 2.821e- 


456 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B
12.64 4.724e-16 194-


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e- 

13.47 2.000e-23 272- 
PR00253C 13.85

7.000e-23 306-328
PR00253D 16.68 5.950e-
21 452-473


467 


PR00849 


GLYCOSYL HYDROLASE

FAMILY 58 SIGNATURE


PR00849D 9.77 9.236e-
09 910-937


471 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12
33-44


472 


BL00226 


Intermediate filaments
proteins.


BL00226B 23.86 3.721e-
09 282-330


473 


BL00344 


GATA-type zinc finger
domain proteins.


BL00344 17.99 7.000e-
17 814-852 


474 


BL00481 


Thiol-activated 
cytolysins proteins. 


BL00481E 13.07 8.909e-
09 173-199


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e-
09 393-408


480 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


38 8-47


481 


PR00405 



HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-

I 44 o FKUU4UDA xf . fX | 

4.971e-18 411-431


482


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e-
10 959-974 PR00049D
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
n nn « ^'>0r»-n^ 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e- 
51 fit;T-K7^ PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567 


PROTEIN RNA- BINDING RNA 


PD00567B 18.23 2.853e- 
09 200-214


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-
12 3-21


489




PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 

27 30-69 PD01066 

19.43 3.430e-10 71-110


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e-
09 663-678 


492


BL01128


Shikimate kinase
proteins.


BL01128A 18.84 6.464e-
17 58-92


497


PF00429


ENV polyprotein (coat


PF00429 31.08 7.171e-






polyprotein).


15 21-71


498 


BL00120


Lipases, serine
proteins.


BL00120B 11.37 7.923e-
09 185-200 


500 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e-
11 299-318


501 


BL01159 


WW/rsp5/WWP domain 
proteins.


BL01159 13.85 8.579e-
12 131-146


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510


508 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417 


6 kw INDUCING XPMC2
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D
11.08 3.800e-13 322-
338


510 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 346-370


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 293-317


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 366-390


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e-
21 1054-1073 PD01841I
23.00 2.667e-13 549-
591


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e-
13 95-111 PR00153E
9.10 4.150e-12 122-138


515 


BL00740 


MAM domain proteins.


BL00740A 13.87 7.188e-
12 410-423


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins.


BL00242C 16.86 8.320e-
09 12-42


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 1.000e-25 84-118


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins.


BL00319C 17.12 8.375e- 
10 61-95 


526


PF00789


Domain present in
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C
20.98 5.269e-09 367-
392


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.500e- 
16 120-164 






WO 01/53312 



PCT/US00/34263





529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 123-
148 


533 


BL00215 


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 



BL00028 16.07 7.429e- 
16 285-302 BL00028 
16.07 6.294e-14 341- 
358 BL00028 16.07 
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.462e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e- 
10 313-330 


537


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins.


BL00250A 21.24 8.000e-
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547


PR00319


BETA G-PROTEIN


PR00319B 11.47 2.714e- 






(TRANSDUCIN) SIGNATURE 


09 186-201 PR00319A 
15.27 7.344e-09 210- 
227 


548 


BL01204 



NF-kappa-B/Rel/dorsal 
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin- 
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins.


BL00290B 13.17 1.600e- 
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding
protein, unique domain
proteins.


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins.


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.977e- 
13 229-268 


569 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524-
553 


575 


BL00752 



XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116 


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578


BL00195


Glutaredoxin proteins.


BL00195B 15.31 7.158e-
09 121-141


579


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.360e-09 386-
400 PR00019A 11.19
3.333e-09 389-403
PR00019B 11.36 8.920e-
09 363-377


580


PR00253


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.68
5.846e-23 444-465
PR00253C 13.85 2.241e-
20 335-357


583


PR00343


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686-
1705


584


DM01537


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 1.878e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.186e-11 784-804


586


PF00013


KH domain proteins
family of RNA binding
proteins.


PF00013 5.78 1.450e-09
124-136


587


DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 4.409e-
13 262-296


589


BL00478


LIM domain proteins.


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321-
336


590


PF00855


PWWP domain proteins.


PF00855 13.75 8.000e-
15 931-948


591


PF00855


PWWP domain proteins.


PF00855 13.75 8.000e-
15 1062-1079


593


PF00628


PHD-finger.


PF00628 15.84 3.455e-
12 424-439


594


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542-
558 PR00205C 13.65
5.304e-12 594-609
PR00205B 11.39 4.273e-
10 336-354



596 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-89 


600 


BL00242 


Integrins alpha chain 
proteins.


BL00242E 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286- 
316 BL00242D 13.57 
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80 








5.000e-11 61-73
BL00242D 13.57 4.986e-
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-213 


602 


PR00278


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787 


615 


PD02699 


PROTEIN DNA-BINDING 
BINDING DNA. 



PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 5.143e- 
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e- 
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN 
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e-
10 160-175


630


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081A 10.53 6.211e-
16 4-22


631


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.500e-
14 37-50


632


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276-
1296 DM01206B 10.69
7.658e-10 1328-1348
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320-
1340 DM01206B 10.69
7.266e-09 1326-1346


635


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.600e-
23 145-176 BL00107B
13.31 2.636e-13 211-
227


636


BL00657


Fork head domain
proteins.


BL00657A 19.39 1.545e-
30 101-143 BL00657B
22.27 7.750e-26 149-
192


637


BL00107


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.000e-
10 607-623


643


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 4.913e-09
199-212


647


PF00628


PHD-finger.


PF00628 15.84 2.350e-
13 385-400 PF00628
15.84 3.455e-12 464-
479


648


BL01129


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-212


649


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 3.908e-
10 455-480


650


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 6.684e-
13 771-814


651


BL50002


Src homology 3 (SH3)
domain proteins profile.


BL50002A 14.19 1.750e-
12 1026-1045


653


PR00253


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 8.800e-24 313-
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-


20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128-
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 

09 580-595


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e- 
13 539-572 DM00215 
19.43 4.750e-12 549- 

582 DM00215 19.43 

9.824e-11 551-584
DM00215 19.43 2.929e- 

10 548-581 DM00215 
19.43 4.054e-10 550- 

583 DM00215 19.43 
5.339e-10 552-585 
DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e- 
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.950e- 
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 8.988e- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


6643 



PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e- 
10 106-123 


674


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608 


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e-
09 249-263


679


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar)


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187-
198


680


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308C 3.83 8.754e-
10 286-296


681


BL00019


Actinin-type actin-
binding domain proteins


BL00019D 15.33 4.200e-
19 227-257


682


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 4.000e-
09 99-118


687


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 8.500e-
10 538-553



689


BL01024


Protein phosphatase 2A
regulatory subunit PR55
proteins.


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 8.071e-
31 152-195


692


BL00211


ABC transporters family
proteins.


BL00211A 12.23 5.050e-
09 45-57


693


BL00211


ABC transporters family
proteins.


BL00211A 12.23 5.050e-
09 45-57


694


BL00211


ABC transporters family
proteins.


BL00211A 12.23 5.050e-
09 58-70


696


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195


697


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e-
11 242-265


698


DM01930


2 kw FINGER SMCX SMCY
YDR096W.


DM01930E 15.41 1.367e-
37 170-215 DM01930F
14.16 8.232e-28 267-
303 DM01930B 19.86
9.163e-10 37-71


700


PR00869


DNA-POLYMERASE FAMILY X
SIGNATURE


PR00869A 12.80 1.281e-
16 245-263


701


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.fi26e-10 105-119
PR00048A 10.52 5.320e-
09 161-175


702


BL00523


Sulfatases proteins.


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e- 
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 


PR00761E 14.32 8.500e- 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.

* 


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 

XU j Do ~ j 1 O 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 


BL00039D 21.67 7.545e- 
27 450-496 BL00039A 

IP 44 m7i=-1fl ii7„ 

XO.^X^X ^>._J.J/C XO XTt / 

186 BL00039C 15.63 
2 21Se-14 280-304 
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins.


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142 


719


BL00243


Integrins beta chain 
cysteine-rich domain 
proteins . 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e- 
34 358-400 BL00243H 
17.53 5.235e-29 567- 
593 BL00243A 17.61 

J .£JVC £.X. DO 01 

BL00243H 17.53 7.167e- 
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190- 
218 PR00704E 12.55 
8.071e-26 165-189
PR00704B 17.94 2.241e- 
9^ 7^-Qft PR00704A
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e- 
13 277-292 PR00320A 
16.74 i.iiue-n *tif- 
292 PR00320C 13.01 
4.522e-ll 323-338 

rKUUjAuA lb . f*± O . DOOc- 

11 323-338 PR00320B 

338 PR00320B 12.19
6.914e-10 277-292 


731 


PR00195 


DYNAMIN SIGNATURE


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.082e- 
i.U /o /- /Ho 


738


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e- 
28 26-65 BL00039D 
21.67 2.105e-20 338- 

9.100e-13 160-184

11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 

1 JbJ 


742


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 

• 


BLOOyobC AS. la l.uuue- 
40 256-305 BL00965B 

14./.// 1 . OUwc-43 i^O 

153 BL00965A 10.57 


747


BL00021 


Kringle domain proteins.


BL00021D 24.56 4.563e-
25 231-273 BL00021B
13.33 5.345e-21 60-78


748 


BL00612


Osteonectin domain 
proteins . 


11 93-126 


749 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


10 135-157


752


BL00795 


Involucrin proteins.


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415


754 


BL00051 


Ribosomal protein jjj^e
proteins. 


BL00051 20.92 1.935e-
16 4-50


755


DM01970 


0 KW ZK632.IZ lUKiUL 
ENDOSOMAL III. 


DM01970B 8.60 7.723e-
09 171-184


760 


BL01020 


SAR1 family proteins.


RT.01090P 15 35 9.020e- 
12 99-150


762 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e- 
29 417-460


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770 


BL00031 


Nuclear hormones 
receptors DNA-binding 
region proteins. 


BL00031A 19.55 9.571e-
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e-
18 4-26 PR00449E
13.50 3.520e-14 142-
165 PR00449C 17.27
3.032e-13 44-67
PR00449D 10.79 8.579e-
13 107-121 PR00449B
14.34 3.455e-11 27-44


773


BL00523 


Sulfatases proteins. 


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F
10.85 5.821e-10 373-
384


775 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 568-585 


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e-
09 621-638


777


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 595-612


778


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239


779


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e-
26 193-222 PR00079E
16.65 4.150e-23 348-
375 PR00079C 8.68
6.351e-16 246-264
PR00079D 13.51 7.070e-
16 264-281 PR00079A
16.12 6.769e-13 169-
183


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221-
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09
159-173


785


BL00690


DEAH-box subfamily ATP- 
dependent helicases 
proteins. 


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125


788


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e- 
10 1-21


790 


BL00915 


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B
22.78 5.050e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407


791


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140
PR00208A 12.59 6.294e-
10 123-141 PR00208A
12.59 6.294e-10 124-
142 PR00208A 12.59
6.294e-10 125-143
PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146
PR00208A 12.59 6.294e-
10 129-147 PR00208A








i O CQ 7 A11f»-OQ 110- 1 


1 




148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e-
09 132-150 PR00208A

1 0 co ft 974e-09 118- i 
136 PR00208A 12.59
8.274e-09 119-137


795


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 5.034e-
16 302-320 PR00205A
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D

15. OH 3. / Ujc -LA .1,-7 / 1 

248 BL00412D 16.54 
7 ftdflp-10 199-250 1 
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54
2.102e-09 201-252


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 


799


BL01052


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
! An A7-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194


800


BL00348 


p53 tumor antigen
proteins.


BL00348F 23.19 3.714e-
09 197-240


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins.


BL00309C 18.65 1.621e-
09 62-87


802


PR00245


OLFACTORY RECEPTOR
SIGNATURE


PR00245D 10.47 5.224e-
09 187-199 


t a r\ a 




Dihydropyridine
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE


PR00667C 11.71 9.875e-
09 12-28


810


PD02346


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12. By 4.340e-
09 317-354


811 


BL00685 


CBF-A/NF-YB subunit 
proteins . 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520 


Interleukin-10 family 
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e- 
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e-
11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153


844


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP.. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57


845 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-
09 203-230


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.429e-
10 15-24


849


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-349


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE


PR00308A 5.90 6.506e-
09 12-27


851


PD02411


PROTEIN TRANSCRIPTION
REGULATION NUCLEAR.


PD02411 21.89 7.000e-
16 246-280



852


BL00420


Speract receptor repeat
proteins domain
proteins.


40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 5.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II 

PHOSPHODIESTERASE 
SIGNATURE 


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 2.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267


866 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868


BL01287


RNA 3'-terminal
phosphate cyclase
proteins.


BL01287A 17.95 2.688e-
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874


BL00188


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL . 


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B 
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-ll 16-36 


897 


PR00327


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.19
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260


901


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 9.321e-
11 6-50


903


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 9.160e-
09 97-111


904


PR00381


KINESIN LIGHT CHAIN
SIGNATURE


PR00381E 8.75 6.586e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55
2.800e-24 107-125
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329


906


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345C 4.54 8.557e-
09 525-549


907


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345C 4.54 8.557e-
09 513-537


908


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155


910


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.800e-
30 48-87


912


BL01104


Ribosomal protein L13e
proteins.


BL01104C 15.14 6.000e-
09 364-392


922


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.842e-09
500-511


923


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C
13.01 5.500e-09 187-
202


924


PD02181


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e-
09 36-64


926


BL00019


Actinin-type actin-
binding domain proteins


BL00019C 14.66 7.453e-
25 108-144 BL00019B
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A
12.56 2.373e-10 34-45


928


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325
BL00678 9.67 7.600e-10
360-371 BL00678 9.67
8.579e-09 206-217


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085 


Ribulose-phosphate 3- 
epimerase family 
proteins. 


BL01085D 16.55 4.600e- 
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 336-362 


937 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49 


940 


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A)
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e- 
09 407-420 


948 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE.


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148 


959 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e- 
12 104-124 DM01206B 
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e- 
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69 
5.671e-09 38-53 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506- 
526 PF01008A 20.14
5.875e-15 369-390


970


BL01277


Ribonuclease PH
proteins.


BL01277C 10.18 7.648e-
10 112-143 BL01277A
17.39 9.806e-10 40-78


975


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.605e-
12 130-145 BL01159
13.85 4.122e-10 171-
186


977


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791C 20.98 2.235e-
09 55-94


978


BL01167


Ribosomal protein L17
proteins.


BL01167B 20.66 8.258e-
19 88-127


979


BL00478


LIM domain proteins


BL00478B 14.79 9.357e-
13 33-48 BL00478B
14.79 7.250e-12 98-113


980


PR00312


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e-
36 169-199 PR00312I
15.78 5.286e-35 332-
361 PR00312F 15.06
5.865e-35 199-229
PR00312H 13.31 8.313e-
35 263-291 PR00312J
13.73 5.688e-34 363-
392 PR00312D 9.43
2.636e-33 128-158
PR00312C 15.14 8.839e-
33 92-122 PR00312B
15.09 8.941e-33 62-92
PR00312G 11.11 6.657e-
32 230-258 PR00312A
11.70 6.914e-27 35-59


981


PF00992


Troponin


PF00992A 16.67 8.816e-
09 414-449


982


PR00299


ALPHA CRYSTALLIN
SIGNATURE


PR00299F 13.20 2.367e-
09 127-149


983


BL01150


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100-
138



986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e-
10 14-59 BL00795C
17.06 7.802e-10 2-47
BL00795C 17.06 8.640e-
10 19-64 BL00795C
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e-
09 3-48


987 


BL00939 


Ribosomal protein L1e
proteins.


BL00939F 17.27 5.393e-
09 810-840


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 497-513


994


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 2.500e-
25 146-189


997 


BL01304 


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e-
11 65-79


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e-
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.565e-09 120-
143 


1005 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46


1007 


PR00304 


TAILLESS COMPLEX 
POLYPEPTIDE 1 
(CHAPERONE) SIGNATURE 


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e- 
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase 
family phosphohistidine 
proteins. 


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-ll 288- 
335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e- 
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain
proteins.


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10
243-254


1043


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172-
186


1045


BL00615


C-type lectin domain
proteins.


BL00615A 16.68 1.720e-
11 218-236 BL00615B
12.25 1.857e-10 317-
331


1046


BL01092


Adenylate cyclases
class-I proteins.


BL01092N 13.54 8.924e-
10 3-40


1047


BL01216


ATP-citrate lyase /
succinyl-CoA ligases
family proteins


BL01216D 21.75 4.316e-
28 314-344 BL01216A
13.91 1.000e-10 97-112


1049


DM00031


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 7.618e-
12 102-136


1050


BL01073


Ribosomal protein L24e
proteins.


BL01073 24.30 1.000e-
40 12-62


1054


BL00571


Amidases proteins.


BL00571 25.69 5. 8/be-
31 160-212


1055


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147


1058


BL00223


Annexins repeat proteins
domain proteins.


BL00223C 24.79 8.754e-
23 262-317 BL00223A
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e-
11 118-152


1060


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 3.455e-
35 158-201


1064


BL00455


Putative AMP-binding
domain proteins


BL00455 13.31 6.211e-
13 280-296


1065


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.000e-
09 115-129 PR00019B
11.36 3.880e-09 87-101


1066


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 4.600e-
16 151-172 PR00326C
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D
19.09 1.257e-13 217-
236


1071


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 8.518e-
11 164-197


1072


PF00856


SET domain proteins


PF00856A 26.14 5.9/be-
09 350-387


1075


BL01009


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e-
20 127-148 BL01009A
13.75 6.586e-13 57-75
BL01009E 13.50 1.439e-
11 159-175


1077


PR00724


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379


1078


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 1.000e-
12 170-195 BL00215A
15.82 7.529e-10 79-104


1079


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.31be-u*
298-309


1081 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460 


Glutathione peroxidases 
selenocysteine proteins.


BL00460A 28.67 3.204e- 
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 



PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e-
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e- 
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins. 


BL00990C 18.78 4.176e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 



PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e- 
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 








159-188 PR00314A 
14.53 1.281e-22 13-34 


1139 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 6.364e-
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.000e- 
19 451-482 BL00107B
13.31 3.077e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB. 


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


28 81-127 PD02894B 
17 Q7 1 1flftp»-77 178- 

Xj • 7J 1 > iOOC X/O 

211 


1159 


BL00623 


GMC oxidoreductases 
proteins . 


nr nnoiT? m no ~\ ^31e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


Dnm Q"* 7 A fi 68 3 475e- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 



G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PROUJdtOA lb . lt t l./y*e- 
10 205-220 PR00320C 
13. Ul /.ofltUQ— xu «£us 
220 PR00320B 12.19 

PR00320A 16.74 7.146e- 
09 35-50 PR00320B 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e-
11 152-187 


1184 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


DJjUU r £US xO.j/ i.xujc 

18 1089-1113 


1185 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins.


BL00983C 12.69 2.761e-

in 77 - Q7 

XW If a3 


1188 


BL00878 


Orn/DAP/Arg

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 

<j/ie ■Rl\nn.R7ftF 19 67 
/L ft D DJJUUO / or x j • u / 

3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 



PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345E 








8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 


1194 



PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46 
5.545e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e- 
11 15-47 


1197 


BL01298 


Dihydrodipicolinate 

reductase proteins. 


BL01298A 13.90 5.959e- 
09 51-73 


1203 


BL00061 


Short-chain

dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


1206 


BL01183 



ubiE/COQ5 

methyltransferase family 
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins. 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023 



Ank repeat proteins. 


PF00023A 16.03 4.857e- 
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048 



C2H2-TYPE ZINC FINGER 
SIGNATURE 



PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e-
10 20-42 PR00450C
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.348e- 
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e- 
15 295-308 PD00066 
13.92 7.231e-15 406- 
419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile.


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B 
16.28 1.000e-40 114- 
168 BL00437C 21.86 
SEQ ID NO



1230 
1231 
1232 



1233 
1235



1237 



1243 
1246 



1249 



1254 



1255 
1256 

1258 



1259 



ACCESSION 
NO. 



DESCRIPTION 



BL01160 
PR00735 
PR00497 



PR00497 
BL00866 

BL00027 



PR00403 
PD01168 



BL00018 



BL00183



BL01115 
BL00373 

PR00011 



BL00518 



RESULTS* 



1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 



Kinesin light chain 
repeat proteins . 
GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 
NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 



BL01160B 19.54 8.297e-

10 6-60 

PR00735A 11.19 6.857e-

09 391-405 
PR00497A 6.92 5.553e- 

10 158-176 



NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 
Carbamoyl-phosphate
synthase subdomain
proteins.



PR00497A 6.92 5.553e- 

10 158-176 

BL00866B 36.29 2.776e- 
09 75-121 



'Homeobox' domain

proteins.

WW DOMAIN SIGNATURE 



BL00027 26.43 1.818e- 
21 36-79 



SYNTHETASE LIGASE 
PROTEIN ALANYL. 



PR00403B 12.19 1.184e- 

11 10-25 

PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 



EF-hand calcium-binding 
domain proteins. 



BL00018 7.41 2.800e-10 
183-196 



Ubiquitin-conjugating
enzymes proteins. 



BL00183 28.97 2.440e- 
36 96-144 



GTP-binding nuclear 
protein ran proteins 
Phosphoribosylglycinamide
formyltransferase
proteins.
TYPE III EGF-LIKE
SIGNATURE 



BL01115A 10.22 5.670e-

11 8-52

BL00373C 10.35 3.348e- 

12 143-156 

PR00011B 13.08 3.217e- 
10 174-193 



Zinc finger, C3HC4 type 
(RING finger), proteins



BL00518 12.23 8.286e-
10 31-40 



1261 



1262 



1263 



1264 



1266 



1269 



1270 



1275 
1276 



PR00070



BL00462 



BL00038 



BL01115 



PR00837 



PR00449 



BL00276 



PD02327 
PR00412 



DIHYDROFOLATE REDUCTASE 
SIGNATURE 



Gamma-
glutamyltranspeptidase

proteins. 



PR00070D 11.63 1.000e-
15 112-127 PR00070C
13.09 9.500e-15 51-63
PR00070A 12.92 5.500e- 

12 16-27 

BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347



Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 



GTP-binding nuclear 
protein ran proteins 



ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 



TRANSFORMING PROTEIN P21 
RAS SIGNATURE 



Channel forming colicins
proteins.



BL00038B 16.97 9.455e-
11 62-83 



BL01115A 10.22 5.670e- 
11 17-61 



PR00837C 17.21 2.714e- 
18 165-182 PR00837A 
14.77 4.512e-12 86-105
PR00837D 11.12 7.577e-
12 201-215 



PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116



GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO.
EPOXIDE HYDROLASE 



BL00276A 8.87 1.500e-
09 17-29 



PD02327C 15.47 9.769e-

09 228-243 

PR00412B 12.59 7.894e-






SIGNATURE 


12 119-135 PR00412C 
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e- 
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 



BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins . 


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-263 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 



Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 



BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins. 


BL00783C 22.43 6.559e- 
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.769e- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281


1337 


PR00700 


PROTEIN TYROSINE 


PR00700D 12.47 2.200e- 






PHOSPHATASE SIGNATURE


09 211-230 


1340 


PR00860 


VERTEBRATE 

METALLOTHIONEIN

SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e- 
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 8.313e- 
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e-
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN

SIGNATURE 


32 416-445 PR00193C 
12.60 6.318e-31 179- 
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97


1352 


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200-
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.877e- 
10 353-373 


1353 



BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303A 21.77 6.667e- 

26 45-82 BL00303B 

26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138 

DUUUV J J Urn J m \J mJ ^ * \J VJ \J w 

18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins. 


PF00615B 16.25 2.216e-
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 

ZINC-FINGER METAL-

BINDING NU. 


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249-
274 BL01272A 6.49 
1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory 

protein family proteins.


BL01272B 19.61 6.670e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94 


1364 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e- 



1371
1372 



BL00242 



10 1-19 



Integrins alpha chain 
proteins.



BL00242B 8.13 8.615e-
09 469-479



1373 



PR00625 



BL00434 



DNAJ PROTEIN FAMILY 
SIGNATURE 



PR00625B 13.48 7.353e- 
19 46-67 PR00625A 
12.84 1.391e-16 14-34 



HSF-type DNA-binding 
domain proteins.



BL00434C 23.85 3.778e- 
09 90-130 



1374 



1375
1376 



1380 



1381 



PR00962 



PD02475 



PD01066 



BL00194 



DM01970 



LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 



PR00962C 8.00 6.337e- 
09 505-526 



MUCIN EPITHELIAL TUMOR- 
ASSOCIATE. 



PD02475A 23.18 8.552e- 
10 1111-1150 



PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.



PD01066 19.43 9.571e-
32 24-63



Thioredoxin family 
proteins. 



BL00194 12.16 8.333e- 
12 48-61 



0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 



DM01970B 8.60 1.458e- 
15 1123-1136 



1383 



1384 



BL00678 



BL00678 



Trp-Asp (WD) repeat 
proteins proteins. 



BL00678 9.67 7.600e-10
243-254



Trp-Asp (WD) repeat 
proteins proteins. 



BL00678 9.67 7.600e-10
271-282



1385 



1386 



1387 



1389 



1390 



1392 



1393 



1394 



BL00303 



BL01160 



BL00518 



PD01066 



PD01066 



PR00308 



PR00380 



PD00066 



S-100/ICaBP type calcium 
binding protein. 



BL00303B 26.15 6.203e-
10 95-132 



Kinesin light chain 
repeat proteins . 



Zinc finger, C3HC4 type 
(RING finger), proteins



BL01160B 19.54 5.042e-

09 1574-1628 

BL00518 12.23 1.000e-
11 52-61 



PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.



PD01066 19.43 3.600e- 
30 10-49 



PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.



PD01066 19.43 3.512e- 
31 32-71 



TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 



PR00308C 3.83 9.723e-
10 127-137



KINESIN HEAVY CHAIN 
SIGNATURE 



PR00380A 14.18 9.625e- 
25 88-110 PR00380D
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243- 
262 



PROTEIN ZINC-FINGER
METAL-BINDI.



PD00066 13.92 3.400e- 
14 462-475 PD00066 
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333 



1398 



1400 



1406 



PD01066 



DM01206 



PD00930 



PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.



CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 



PD01066 19.43 6.786e- 
32 10-49 

DM01206B 10.69 7.038e-
09 270-290 



PROTEIN GTPASE DOMAIN 
ACTIVATION. 



PD00930A 25.62 7.324e-
15 363-389 



1407 



1408 



BL00030



PR00019 



Eukaryotic RNA-binding 
region RNP-1 proteins.



BL00030A 14.39 7.500e-
10 457-476 



LEUCINE-RICH REPEAT
SIGNATURE 



PR00019A 11.19 9.550e-
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e- 








09 176-190


1409 


PR00510


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e-

Xc. loz-^U<c IrKUUblUJJ 

12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins.


BL00358B 22.75 1.000e-
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26 
5.500e-13 143-158 
BLQU3boA lo . Ub 1 . yjie- 
11 33-44 


1414 


BL00282


Kazal serine protease 
inhibitors family 
proteins.


BL002bz Ib.tio / .iioe- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 4.300e- 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE . 


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e-
09 428-443 


1420 


PD01941


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92 


2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE


PR00209B 4.88 6.318e- 


11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628


PHD-finger.


FrUUb-iO JLD.o^c 5.U4ac" 

12 330-345 


1426


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 

17.71 4.306e-14 262- 
282 


1428 


BL00039


DEAD-box subfamily ATP-
dependent helicases

proteins.


BL00039D 21.67 5.219e-
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928 


GRAVES DISEASE CARRIER


PR00928B 13.53 3.769e- 






PROTEIN SIGNATURE 


10 103-124 


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e-
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A
20.89 4.000e-09 188- 
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e-
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 



PD01841 



PHOSPHORYLASE KINASE
ALPHA MUSCL. 



PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H 
21.30 3.189e-31 435- 
472 PD01841C 13.78 
1.000e-25 185-206 
PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyltransferase
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins.


BL00545C 11.28 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686


Starch binding domain 
proteins . 


PF00686A 13.45 9.100e- 
09 267-277 
SEQ ID NO



1477 



1478 



1479 



1480 

1481 
1482 

1483 



1485 



1486 



1488 



1490 



1491 



1492 
1497 



1500 



1502 



1503 



1505 



1506 



1512 



1516 



1518 



ACCESSION 
NO. 



DESCRIPTION 



RESULTS*



PF00566 



BL00030 



DM00406 



BL00290

PR00150
PF00780

BL01160
PD01066



BL00107



BL00039



BL00166 



BL00452 



PR00019 
BL00107 



PF00876 



BL00027 



BL00027 



BL01177 



Probable rabGAP domain 
proteins . 



PF00566A 12.64 7.333e- 
10 466-476 



BL00972 



Eukaryotic RNA-binding 
region RNP-1 proteins . 



BL00030B 7.03 9.400e- 
10 43-53 



GLIADIN . 



BL00523 



DM00406 7.73 8.541e-10 
292-305 



BL00914 



BL00600 



Immunoglobulins and 
major histocompatibility 
complex proteins. 
PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE
Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM . 
Kinesin light chain 
repeat proteins.
PROTEIN ZINC FINGER
ZINC-FINGER METAL-

BINDING NU.

Protein kinases ATP- 
binding region proteins 
DEAD-box subfamily ATP-
dependent helicases
proteins.



BL00290B 13.17 2.385e- 
15 69-87 BL00290A 
20.89 5.091e-11 12-35
PR00150F 10.45 9.039e- 
09 21-51 



PF00780I 14.69 4.825e- 
09 107-137 

BL01160B 19.54 1.153e-

09 108-162 

PD01066 19.43 5.790e-
25 17-56 



BL00107B 13.31 1.529e- 
09 34-50 



BL00039D 21.67 9.586e-
10 116-162 



Enoyl-CoA 

hydratase/isomerase
proteins.



BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 
167 BL00166B 16.92 
9.357e-11 93-115



Guanylate cyclases 
proteins . 



BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115-
131 



LEUCINE-RICH REPEAT 

SIGNATURE 

Protein kinases ATP- 
binding region proteins 



PR00019A 11.19 3.667e- 

09 532-546 

BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 



Ogre family. 



PF00876E 7.99 1.947e- 
10 107-117 



'Homeobox' domain
proteins.



BL00027 26.43 4.789e- 
24 112-155 



'Homeobox' domain
proteins.



BL00027 26.43 4.789e- 
24 112-155 



Anaphylatoxin domain 
proteins. 



BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 



Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 



Sulfatases proteins 



Syntaxin / epimorphin
family proteins 



Aminotransferases class- 
III pyridoxal-phosphate
attachment si. 



BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e- 

10 341-363 

BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 
BL00914 24.91 7.045e- 

14 168-218

BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302- 








331 BL00600G 12.43 
9.625e-17 377-396 
BL00600B 19.60 5.091e- 
15 160-186 BL00600C 
16.18 6.040e-12 190- 
206 BL00600F 8.77 
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e- 
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.508e- 
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e- 
10 103-127 


1540 



PR00965 



OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 



PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e-
17 163-207 


1543 



PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A
8.91 2.286e-34 219-248
PD02699B 18.28 6.143e-
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D 
13.94 8.714e-40 142- 
177 BL00951A 15.10 
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e- 
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e- 
09 58-73 
SEQ ID NO: ACCESSION

NO.

1556 BL00061



1557 



BL01228 



1558 



BL01228 



1559 



BL01228 



1562 



BL00522 



1563



PF00651 



1564 



BL00299 



1566 



BL01013 



1567 



BL00678 



1570 



BL00479 



1576 



PR00665 



1577 



1579 
1580 

1581 



1582 



1584 



1585 



1586 



DM00099 



BL00524 



PD02894 



BL00411 



PR00604 



DESCRIPTION 

Short-chain
dehydrogenases/reductases

family proteins



RESULTS* 

BL00061B 25.79 6.276e- 
13 67-105 



Hypothetical cof family 
proteins.



BL01228D 17.44 8.105e- 
12 107-132 



Hypothetical cof family
proteins



BL01228D 17.44 8.105e-
12 107-132 



PF00651 



Hypothetical cof family 
proteins 



BL01228D 17.44 8.105e- 
12 107-132 



DM01551 



DM01354 



DNA polymerase family X 
proteins . 



BTB (also known as BR-
C/Ttk) domain proteins 



BL00522C 11.90 6.600e-
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 

575 

PF00651 15.00 1.947e- 
11 46-59 



Ubiquitin domain 
proteins . 



BL00299 28.84 2.823e- 
10 324-376 



Oxysterol-binding
protein family proteins 



BL01013D 26.81 8.594e-
17 184-228 BL01013C 
9.97 4.906e-12 14-24 



Trp-Asp (WD) repeat 
proteins proteins.



BL00678 9.67 3.400e-10 
378-389 BL00678 9.67 
5.800e-10 418-429 
BL00678 9.67 8.800e-10 
295-306 



Phorbol esters /
diacylglycerol binding 
domain proteins. 



OXYTOCIN RECEPTOR 
SIGNATURE 



4 kw A55R REDUCTASE
TERMINAL 

DIHYDROPTERIDINE.



Somatomedin B domain 
proteins . 
HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 



BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 

12 173-189 

PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C 
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 

15 11-25 

DM00099B 14.73 9.308e-
10 127-137 



Kinesin motor domain
proteins.



CLASS IA AND IB 
CYTOCHROME C SIGNATURE 



BTB (also known as BR- 
C/Ttk) domain proteins . 



kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 



kw TRANSCRIPTASE REVERSE 
II ORF2.



BL00524A 9.65 6.776e- 

14 52-73 

PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 
BL00411C 15.04 5.292e- 
12 32-54 BL00411H 
15.66 4.441e-11 245-

276 

PR00604A 11.13 2.440e- 
09 79-87 



PF00651 15.00 1.000e-
10 225-238 



DM01551C 14.62 9.455e- 
11 125-145 



DM01354S 11.61 7.750e- 
09 474-495 


1587 


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-33 180-210
PR00072A 12.75 6.040e-25 120-145
PR00072C 11.42 2.286e-24 216-239
PR00072D 10.77 3.400e-22 276-295
PR00072E 10.54 1.360e-19 301-318
PR00072G 10.45 5.304e-19 433-450
PR00072F 8.87 5.935e-15 332-349


1589 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e-13 211-224
DM01970B 8.60 2.157e-12 94-107


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 


DM00517B 10.96 6.625e-16 1175-1193
DM00517A 8.21 1.000e-11 1015-1026


1592 


BL00037 


Myb DNA-binding domain 
proteins repeat proteins 
proteins . 


BL00037B 15.92 3.250e-27 116-142
BL00037A 16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-12 31-55
BL00037B 15.92 3.526e-11 64-90
BL00037C 16.86 9.654e-10 146-164


1595 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e- 
10 44-57 


1607 


BL00252 


Interferon alpha, beta 
and delta family 
proteins . 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.000e-08 61-94


1611 


BL00904 


Protein prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63 
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e-22 500-541
PD01427A 19.94 8.773e-18 439-472


1616


BL00115


Eukaryotic RNA polymerase II heptapeptide repeat proteins


BL00115Z 3.12 7.485e-09 152-201
BL00115Z 3.12 9.603e-09 145-194


1617


BL00303


S-100/ICaBP type calcium binding protein.


BL00303B 26.15 7.750e-32 51-88
BL00303A 21.77 1.947e-31 4-41


1618


BL01254


Fetuin family proteins


BL01254F 10.02 8.754e-09 137-147


1619


PD01888


PEPTIDE REDUCTASE PROTEIN METHI.


PD01888B 25.10 1.000e-40 47-97
PD01888C 21.56 7.000e-30 125-155
PD01888A 12.84 8.800e-15 7-23


1621


PR00239


MOLLUSCAN RHODOPSIN C-TERMINAL TAIL SIGNATURE


PR00239E 1.58 3.455e-09 692-704
PR00239E 1.58 4.580e-09 697-709
PR00239E 1.58 4.580e-09 702-714
PR00239E 1.58 5.193e-09 703-715


1622


PR00860


VERTEBRATE METALLOTHIONEIN SIGNATURE


PR00860B 7.04 1.900e-18 27-41
PR00860C 9.61 1.474e-14 41-51
PR00860A 5.46 1.720e-14 5-18


1624


PR00784


MITOCHONDRIAL BROWN FAT UNCOUPLING PROTEIN SIGNATURE


PR00784D 15.86 8.027e-11 77-95


1626


BL00325


Actin-depolymerizing proteins.


BL00325B 21.66 1.000e-40 93-139
BL00325A 24.83 6.786e-23 61-93


1631


BL00064


L-lactate dehydrogenase proteins.


BL00064B 23.57 1.000e-40 82-130
BL00064C 17.28 1.000e-40 137-182
BL00064E 27.20 1.000e-40 223-275
BL00064F 25.14 7.882e-36 286-331
BL00064A 21.16 1.000e-33 22-60
BL00064D 14.19 6.[illegible]00e-31 182-212


1632


PR00063


RIBOSOMAL PROTEIN L27 SIGNATURE


PR00063B 15.24 9.700e-11 59-84
PR00063A 11.71 1.614e-09 34-59


1634


PR00239


MOLLUSCAN RHODOPSIN C-TERMINAL TAIL SIGNATURE


PR00239D 0.00 1.105e-11 36-49
PR00239C 3.51 2.538e-09 37-45


1636


BL01210


Caveolins proteins


BL01210B 13.92 9.531e-10 133-183


1637


BL00982


Bacterial-type phytoene dehydrogenase proteins.


BL00982A 18.41 5.388e-11 11-43


1639


BL01183


ubiE/COQ5 methyltransferase family proteins.


BL01183B 21.31 8.144e-12 132-177


1640


PR00015


GRAM-POSITIVE COCCUS SURFACE PROTEIN ANCHOR SIGNATURE


PR00015B 9.84 8.468e-10 128-149


1641


PR00320


G-PROTEIN BETA WD-40 REPEAT SIGNATURE


PR00320B 12.19 5.935e-11 364-379
PR00320A 16.74 7.828e-11 364-379
PR00320C 13.01 2.800e-10 279-294
PR00320C 13.01 2.800e-10 364-379
PR00320B 12.19 5.114e-10 279-294
PR00320A 16.74 1.659e-09 279-294
PR00320A 16.74 2.098e-09 229-244


1642 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.464e-09 114-130


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e-21 103-125
PR00380D 9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-16 332-351
PR00380B 12.64 6.657e-15 292-310


1647


DM01242


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e-37 340-381
DM01242E 23.00 5.071e-31 463-505
DM01242D 23.29 3.925e-30 420-463
DM01242B 23.57 8.054e-18 265-314
DM01242F 10.61 7.618e-14 526-540


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.720e- 
11 431-485 


1652 


BL00933 


FGGY family of 
carbohydrate kinases 
proteins . 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins. 


BL00795C 17.06 2.988e- 
10 70-115 


1654 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 302-334 


1655


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 282-314 


1656 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e-11 114-136


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins. 


BL00406D 12.58 8.767e- 
15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e-13 1140-1157
PR00105B 12.32 2.800e-12 1259-1274
PR00105C 10.86 1.000e-10 1305-1319


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-23 107-125
PR00319C 13.41 5.714e-20 89-105
PR00319A 15.27 5.286e-19 51-68
PR00319B 11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10 
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOLl/NOP2/sun family 
proteins . 


BL01153D 19.69 1.188e- 
17 115-141 BL01153C 
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598 


Chromo domain proteins . 


BL00598 14.45 8.500e-20 27-49


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D 
0.00 1.286e-10 342-357 


1676 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-19 427-448
PR00747G 14.50 2.286e-18 368-393
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747D 15.23 8.759e-17 163-183
PR00747E 15.13 8.244e-15 254-272
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 311-328


1677 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-19 309-330
PR00747G 14.50 2.286e-18 250-275
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 193-210


1680 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e-10 418-433
PR00456E 3.06 7.281e-10 419-434
PR00456E 3.06 8.125e-10 420-435


1692 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e-24 274-317
BL00674B 4.46 4.000e-23 241-263
BL00674D 23.41 8.560e-18 338-385
BL00674E 15.24 1.720e-15 414-434


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e-13 187-208
PR00466B 5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-09 498-517


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-12 283-300
BL00028 16.07 3.769e-11 255-272
BL00028 16.07 5.154e-11 171-188
BL00028 16.07 5.500e-11 227-244
BL00028 16.07 1.600e-10 199-216


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 6.523e-11 232-247
BL01159 13.85 5.408e-10 613-628


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop-helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-12 79-100
BL00038A 13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567C 9.17 8.500e- 
09 418-428 


1724 


BL01279 


Protein-L-isoaspartate(D-aspartate)
O-methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.059e-11 73-86
BL00018 7.41 4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins . 


BL00594A 16.75 1.089e-09 17-61


1731


BL01160


Kinesin light chain repeat proteins.


BL01160B 19.54 9.676e-10 296-350


1732


BL01160


Kinesin light chain repeat proteins.


BL01160B 19.54 9.676e-10 316-370


1733


PF00850


Histone deacetylase family.


PF00850F 15.70 4.34[illegible]e-22 246-279
PF00850D 14.76 6.850e-20 177-201
PF00850[illegible] 8.691e-18 209-235
PF00850[illegible] [illegible]e-14 281-323


1734


BL00354


HMG-I and HMG-Y DNA-binding domain proteins (A+T-hook).


BL00354C 6.61 5.932e-09 292-307


1735


DM00179


W KINASE ALPHA ADHESION T-CELL.


DM00179 13.97 [illegible]e-10 492-502


1743


PR00449


TRANSFORMING PROTEIN P21 RAS SIGNATURE


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1744


PR00449


TRANSFORMING PROTEIN P21 RAS SIGNATURE


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1745


BL00720


Guanine-nucleotide dissociation stimulators CDC25 family sign.


BL00720B 16.57 8.297e-15 136-160


1746


PR00081


GLUCOSE/RIBITOL DEHYDROGENASE FAMILY SIGNATURE


PR00081B 10.38 6.72[illegible]e-11 45-57
PR00081E 17.54 3.93[illegible] [illegible]-168


1747


BL00439


Acyltransferases ChoActase / COT / CPT family proteins.


BL00439H 18.24 8.435e-14 65-91
BL00439G 13.40 2.895e-12 3-14


1749


PR00819


CBXX/CFQX SUPERFAMILY SIGNATURE


PR00819B 10.83 7.158e-11 4-20


1751


PD00066


PROTEIN ZINC-FINGER METAL-BINDI.


PD00066 13.92 3.400e-14 33-46
PD00066 13.92 1.000e-13 89-102
PD00066 13.92 7.000e-13 61-74
PD00066 13.92 6.571e-12 117-130


1753


BL01013


Oxysterol-binding protein family proteins.


BL01013D 26.81 [illegible]e-18 33-77


1754


BL00790


Receptor tyrosine kinase class V proteins.


BL00790I 20.01 2.393e-09 490-521
BL00790I 20.01 2.821e-09 60-91
BL00790I 20.01 [illegible]e-09 287-318


1756


PD01066


PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 [illegible]e-35 10-49


1758


DM00406


GLIADIN.


DM00406 7.73 7.600e-09 [illegible]


1762


PD02929


ADHESION GLYCOPROTEIN PRECURSOR I.


PD02929A 28.27 4.529e-09 224-278


1765


PR00326


GTP1/OBG GTP-BINDING PROTEIN FAMILY SIGNATURE


PR00326A 8.75 [illegible]e-11 146-167


1775


PF00023


Ank repeat proteins.


PF00023A 16.03 3.077e-14 523-539


1776


BL00942


glpT family of transporters proteins.


BL00942F 15.07 4.343e-10 371-389
BL00942B 20.36 8.040e-09 94-137


1777


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.373e-09 279-312


1778 


BL00084 


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e-20 169-224
BL00084B 24.26 8.134e-16 10-58
BL00084C 27.71 8.412e-11 107-158


1779


BL01013


Oxysterol-binding
protein family proteins.


BL01013D 26.81 3.758e-18 611-655
BL01013A 25.14 2.881e-15 344-380
BL01013C 9.97 6.308e-13 435-445
BL01013B 11.33 3.717e-[illegible]


1783


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
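
The field order given in the footnote above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one RESULTS cell might be split into its hits. The `SignatureHit` type, the `parse_results` function, and the accession pattern (two letters, five digits, optional subtype letter) are illustrative assumptions inferred from the entries in this table; the pattern also tolerates the line wraps that occur after "e-" and after the position hyphen.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical container for illustration)."""
    accession: str    # accession number with optional subtype letter, e.g. "BL00191H"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Assumed layout, following the footnote: accession, raw score, p-value, position.
_HIT = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+"   # accession (two letters, five digits, optional subtype)
    r"(\d+\.\d+)\s+"              # raw score
    r"(\d+\.\d+e-\s*\d+)\s+"      # p-value; a line wrap may follow "e-"
    r"(\d+)-\s*(\d+)"             # position; a line wrap may follow "-"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Split one RESULTS cell into its individual hits."""
    text = " ".join(cell.split())  # undo line wrapping
    hits = []
    for acc, score, p, start, end in _HIT.findall(text):
        hits.append(
            SignatureHit(acc, float(score), float(p.replace(" ", "")), int(start), int(end))
        )
    return hits


# Example using the RESULTS cell of SEQ ID NO: 1589 as printed in the table.
example = "BL00191H 15.64 1.537e-\n22 61-113 BL00191K\n17.38 9.027e-12 398-\n442"
hits = parse_results(example)
```

Each returned tuple corresponds to one signature match, so a cell listing several subtypes yields several hits in the order printed.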
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TABLE 4 



SEQ ID 
NO: 



10 



12 



14 



17 



18 



20 



PFAM NAME 



pkinase 



zf-C2H2 



fn3



fn3 



fn3 



fn3 
TBC 



p450 



ank



zf-MYND 
zf-MYND 



zf-C2H2 



DESCRIPTION 



2.1e-32 
1.3e-29 



TBC domain 



4e-40 



Cytochrome P450 



9.5e-17



Ank repeat



6e-20 



Immunoglobulin domain 



1.7e-05



MYND finger 
MYND finger 



1.3e-06 
1.3e-06 



Zinc finger, C2H2 type 



1.7e-99 



CAP_GLY



CAP-Gly domain 



1.2e-25 



IMPDH_C



IMP dehydrogenase / GMP 
reductase C terminus 



1.6e-119 



PFAM 
SCORE 
5 



109 
110.7 



1097.1 



1035.0 
1090.4 



1097.1 
146.7 



62.0 
79.7 



35.4 



343.9 



410.5 



21


IMPDH_C


IMP dehydrogenase / GMP reductase C terminus


4.3e-102


352.6


22


pkinase


Eukaryotic protein kinase domain


2.4e-79


277.0


23


pkinase


Eukaryotic protein kinase domain


8.4e-74


258.6


25


RNA_pol_A


RNA polymerase alpha subunit


0


1077.7


26


[illegible]


[illegible]


1.9e-10


44.4


27


Ribosomal_L2


Ribosomal protein L23


7.8e-32


111.2


28


Ribosomal_L2


Ribosomal protein L23


1e-29


104.2


[illegible]


zf-A20


A20-like zinc finger


1.5e-10


48.5


31


zf-A20


A20-like zinc finger


1.5e-10


48.5


[illegible]


FMN_dh


FMN-dependent dehydrogenase


5.4e-179


608.1


[illegible]


PID


Phosphotyrosine interaction domain (PTB/PID)


3.8e-59


209.9


35


ig


Immunoglobulin domain


1.4e-13


48.8


36


ig


Immunoglobulin domain


1.4e-13


48.8


40


kinesin


Kinesin motor domain


6.7e-76


265.6


44


Ets


Ets-domain


1.4e-56


182.1


45


Ets


Ets-domain


1.4e-56


182.1


46


LRR


Leucine Rich Repeat


1.7e-13


58.3


48


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


552.8


49


ITAM


Immunoreceptor tyrosine-based activation mot


1.4e-05


31.9


50


UCH-2


Ubiquitin carboxyl-terminal hydrolase family


1.1e-26


102.0


51


UCH-2


Ubiquitin carboxyl-terminal hydrolase family


1.1e-26


102.0


52


ras


Ras family


6.5e-45


162.3


53


PRK


Phosphoribulokinase


2.1e-65


230.7


54


myb_DNA-binding


Myb-like DNA-binding domain


0.096


15.2


55


voltage_CLC


Voltage gated chloride channels


3.3e-186


631.9


56


sugar_tr


Sugar (and other) transporter


0.00015


-64.3


57


TBC


TBC domain


2.2e-37


137.6


58


ank


Ank repeat


5.9e-25


96.3


59


ank


Ank repeat


5.9e-25


96.3


67


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


7.9e-49


175.6


68


C2


C2 domain


7.9e-54


192.2


69


C2


C2 domain


2.3e-54


194.0


[illegible]


Kelch


Kelch motif


9.4e-99


341.5


72


ig


Immunoglobulin domain


8.2e-28


94.7


73


pkinase


Eukaryotic protein kinase domain


8e-69


242.1


SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE






74


pkinase


Eukaryotic protein kinase domain


2.8e-38


140.6


76


C4_Topoisom


Topoisomerase DNA binding C4 zinc fing


5.4e-54


192.8


83


Peptidase_S9


Prolyl oligopeptidase family


4.3e-10


36.8


84


fn3


Fibronectin type III domain


4.1e-51


183.2


86


SH2


Src homology domain 2


3.1e-22


67.7


88


ig


Immunoglobulin domain


0.0091


14.0


89


WD40


WD domain, G-beta repeat


2.1e-21


84.6


92


laminin_G


Laminin G domain


6.1e-27


98.5


93


AMP-binding


AMP-binding enzyme


2.4e-13


-37.2


95


pkinase


Eukaryotic protein kinase domain


1.4e-59


211.4


96


pkinase


Eukaryotic protein kinase domain


2.6e-51


183.9


97


adh_short


short chain dehydrogenase


2e-61


217.5


98


kinesin


Kinesin motor domain


2.2e-86


300.4


101


IRS


PTB domain (IRS-1 type)


5.4e-36


133.0


102


AAA


ATPases associated with various cellular act


6.8e-05


-5.2


104


pkinase


Eukaryotic protein kinase domain


2.7e-73


256.9


106


ras


Ras family


8.3e-24


92.5


107


FYVE


FYVE zinc finger


5.4e-27


100.7


108


Cyt_reductase


FAD/NAD-binding Cytochrome reductase


7.7e-61


215.5


109


zf-C2H2


Zinc finger, C2H2 type


2.3e-122


420.0


113


pkinase


Eukaryotic protein kinase domain


4e-88


306.2


116


PH


PH domain


3.1e-11


45.2


117


lipocalin


Lipocalin / cytosolic fatty-acid binding pr


2.4e-14


53.5


118


pkinase


Eukaryotic protein kinase domain


4.5e-20


76.3


120


WD40


WD domain, G-beta repeat


2.4e-14


61.1


121


WD40


WD domain, G-beta repeat


2.4e-14


61.1


123


IF5_eIF4_eIF2


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2


124


ig


Immunoglobulin domain


6.5e-08


30.[illegible]


127


mito_carr


Mitochondrial carrier proteins


3e-16


58.6


128


PP2C


Protein phosphatase 2C


2.2e-71


250.6


129


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6


130


pfkB


pfkB family carbohydrate kinase


4.5e-42


137.1


133


ACBP


Acyl CoA binding protein


4.6e-22


86.7


134


rrm


RNA recognition motif.


1.2e-31


118.5


135


IQ


IQ calmodulin-binding motif


2.6e-08


41.0


136


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


9.3e-22


85.7


139


WH2


Wiskott Aldrich syndrome homology region 2


0.0067


23.1


140


zf-C2H2


Zinc finger, C2H2 type


1.7e-82


287.5


141


Peptidase_S26


Signal peptidase I


5.7e-10


35.7


143


arf


ADP-ribosylation factor family


1.2e-39


145.2


146


KRAB


KRAB box


7.3e-30


112.6


148


DUF6


Integral membrane protein DUF6


0.096


8.0


149


PDEase


3'5'-cyclic nucleotide phosphodiesterase


3.8e-80


231.1


151


S4


S4 domain


1.1e-08


42.3


153


tRNA-synt_1d


tRNA synthetases class I (R)


3.8e-103


356.1


154


Cyt_reductase


FAD/NAD-binding Cytochrome reductase


7.8e-60


212.2


155


ras


Ras family


3.6e-28


107.0


157


actin


Actin


3.8e-26


87.1


158


Jacalin


Jacalin-like lectin domain


0.09


-24.9


160


Zn_carbOpept


Zinc carboxypeptidase


5e-138


471.9


165


pkinase


Eukaryotic protein kinase domain


5.1e-67


236.1


167


zf-C3HC4


Zinc finger, C3HC4 type (RING finger)


5.3e-07


27.0


168


Ribosomal_S15


Ribosomal protein S15


1.1e-06


29.0


169


DEAD


DEAD/DEAH box helicase


1e-48


157.0


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 
3.7e-15 


-17.4 
58.6 



175 



178 



globin 

WW



domain 
Globin 
WW domain 



ras 



ATP1G1_PLM_M
AT8



Ras family 

ATP1G1/PLM/MAT8 family



4.6e-18 
7.3e-06
1e-31



2.5e-17



67.4 
32.9 



118.8 



71.0 



179


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2


180


C1q


C1q domain


8.8e-72


251.9


190


Y_phosphatase


Protein-tyrosine phosphatase


4.9e-287


967.0


191


efhand


EF hand


7.5e-16


66.1


193


pkinase


Eukaryotic protein kinase domain


6.5e-82


285.6


194


bromodomain


Bromodomain


5.8e-31


111.4


195


PALP


Pyridoxal-phosphate dependent enzyme


2.5e-64


227.1


197


DnaJ


DnaJ domain


1.6e-38


141.4


199


RrnaAD


Ribosomal RNA adenine dimethylases


0.00018


16.9


200


acid_phosphat


Histidine acid phosphatase


2.5e-10


37.2


201


WH2


Wiskott Aldrich syndrome homology region 2


[illegible]


[illegible]


204


vATP-synt_AC39


ATP synthase (C/AC39) subunit


[illegible]


[illegible]


205


vATP-synt_AC39


ATP synthase (C/AC39) subunit


[illegible]


476.9


206


ldl_recept_a


Low-density lipoprotein receptor domain


2.4e-25


97.6


209


ank


Ank repeat


1.4e-19


78.4


210


Rhomboid


Rhomboid family


0.0035


1.2


211


C1q


C1q domain


1.6e-70


247.7


212


UQ_con


Ubiquitin-conjugating enzyme


7.4e-74


258.8


213


UQ_con


Ubiquitin-conjugating enzyme


1e-53


191.9


215


DEAD


DEAD/DEAH box helicase


1.8e-43


140.4


216


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


4.5e-21


83.4


218


Glycos_transf_2


Glycosyl transferases


4e-21


83.6


219


ig


Immunoglobulin domain


0.092


10.7


222


WD40


WD domain, G-beta repeat


7.4e-23


89.4


224


TPR


TPR Domain


1.2e-08


42.1


225


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats)


1.5e-38


141.5


226


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats)


1.5e-38


141.5


229


HSP70


Hsp70 protein


2.4e-54


194.0


230


GSHPx


Glutathione peroxidases


3.4e-47


170.2


231


tsp_1


Thrombospondin type 1 domain


0.0075


17.1


233


cyclin


Cyclin


4.6e-144


492.0


234


ras


Ras family


4.8e-50


179.7


235


LRR


Leucine Rich Repeat


1.2e-30


115.3


236


LRR


Leucine Rich Repeat


6.7e-29


109.4


237


PDZ


PDZ domain (Also known as DHR or GLGF).


1.7e-09


45.0


244


dCMP_cyt_deam


Cytidine and deoxycytidylate deaminase


2.5e-0[illegible]


31.1


245


ig


Immunoglobulin domain


6.7e-08


30.5


248


wnt


wnt family of developmental signaling protei


9.1e-270


742.6


250


mito_carr


Mitochondrial carrier proteins


1.3e-55


193.6


254


adenylatekinase


Adenylate kinase


1.8e-14


55.7


255


Cation_efflux


Cation efflux family


2.8e-33


124.0


256


SH3


SH3 domain


3.9e-14


60.4


257


Aa_trans


Transmembrane amino acid transporter protein


2.6e-52


187.2


258


adenylatekinase


Adenylate kinase


2.1e-110


380.2


259


HIT


HIT family


8.2e-07


25.3


260


Bacterial_PQQ


PQQ enzyme repeat


1.6e-15


65.0


262


proteasome


Proteasome A-type and B-type


6.5e-64


225.7


267


pkinase


Eukaryotic protein kinase domain


6.3e-27


101.0


270


filament


Intermediate filament proteins


3.2e-150


512.5


271


Choline_kinase


Choline/ethanolamine kinase


2e-67


237.4


277


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6


279


pkinase


Eukaryotic protein kinase domain


3.3e-77


269.9


280


WD40


WD domain, G-beta repeat


7.8e-73


255.4


281


WD40


WD domain, G-beta repeat


7.8e-73


255.4


284


zf-DHHC


DHHC zinc finger domain


4.6e-24


93.4


287


Exonuclease


Exonuclease


1.4e-67


238.0


291


SAM


SAM domain (Sterile alpha motif)


0.034


11.2


292


SAM


SAM domain (Sterile alpha motif)


0.034


11.2


294


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7


295


zf-C2H2


Zinc finger, C2H2 type


2.2e-125


430.0


296


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5


297


HMG_box


HMG (high mobility group) box


6.7e-29


109.4


302


Glycos_transf_4


Glycosyl transferase


5e-87


302.5


304


tRNA-synt_2


tRNA synthetases class II (D, K and N)


1.1e-84


294.8


305


KRAB


KRAB box


2e-44


161.0


306


rrm


RNA recognition motif.


2.7e-44


160.6


308


7tm_1


7 transmembrane receptor (rhodopsin family)


5.2e-39


126.1


309


DNA_polymeraseX


DNA polymerase X family


2.4e-64


227.2


311


F-box


F-box domain.


9.5e-08


39.2


312


ig


Immunoglobulin domain


6.8e-19


65.9


313


Ets


Ets-domain


8.1e-60


192.3


315


Kelch


Kelch motif


1.3e-106


367.6


317


arf


ADP-ribosylation factor family


3.2e-35


130.4


318


sugar_tr


Sugar (and other) transporter


0.0003


-73.1


320


pkinase


Eukaryotic protein kinase domain


8.1e-83


288.6


322


pkinase


Eukaryotic protein kinase domain


4.9e-81


282.6


324


Xlink


Extracellular link domain


4.5e-143


331.[illegible]


326


ARID


ARID DNA binding domain


5.1e-37


136.4


327


HMG_box


HMG (high mobility group) box


6.7e-29


109.4


328


cadherin


Cadherin domain


8.1e-81


281.9


331


chromo


'chromo' (CHRromatin Organization Modifier)


4e-18


66.7


333


Peptidase_M22


Glycoprotease family


1.2e-136


467.4


SEQ ID
NO:



335



339
340



342 
34T 



346 



347 



351 



352 



354 

355 



358 



359 



361 



362 



367 



368 
369 



372 



373 



376 



377 



380 



381 

382 



384 



386 



388 



389 



390 



392 



393 



394 



396 



397 



399 



400 



401 



402 



404 



405
406



410 



411 



415 



418 



419
420
42[illegible]
42[illegible]



425 



426 



427 



PFAM NAME 



vwa 



ras 



zf-C2H2



zf-C2H2 



pkinase 
pkinase 



EGF 



ank 



TBC 
PHD 



DUF6 



zf-C2H2 



ank 



ArfGap 



LRR 



laminin_G
PP2C 



LIM 



KRAB 



ion_trans



Beach 
pkinase 



AMP-binding
HECT 



DESCRIPTION 



von Willebrand factor type A
domain
Ras family



Zinc finger, C2H2 type



Zinc finger, C2H2 type
Immunoglobulin domain
Eukaryotic protein kinase
domain
Eukaryotic protein kinase
domain
EGF-like domain



p-value



PFAM 
SCORE 



2.3e-07



37.9



7.8e-07



-59.1



8.2e-64



225.4



2.4e-85
0.0005



297.0 
18.0 



6.5e-65 



229.1 



6.5e-65 



229.1 



8.5e-20 



79.2 



Ank repeat 



TBC domain 



Integral membrane protein DUF6
Zinc finger, C2H2 type 



2.5e-101 



350.0 



5.1e-15 



63.3 
37.4 



0.033 



15.8 



Ank repeat 



Putative GTP-ase activating 
protein for Arf 



7.4e-20 



79.4 



6.6e-34



126.1



Leucine Rich Repeat
Laminin G domain 
Protein phosphatase 2C 



LIM domain containing proteins
KRAB box



4.7e-53



5.4e-10



189.7



46.6



8.8e-44



158.9 



1.5e-33 
5.3e-20 



121.7 
73.9



9.9e-15 



57.1 



4.8e-23 



90.0



Ion transport protein



2.9e-09



-4.2



Beige/BEACH domain



4.9e-208



704.5



Eukaryotic protein kinase 
domain 



1.6e-94 



327.5 



ank 



zf-C2H2 



mito_carr



TPR 



SH3 



AAA 



spectrin 



AMP-binding enzyme 

ubiquit 

transferase) . 
Ank repeat 



in- 



1.4e-07 
1.3e-07 



-140.3 
-13.5 



2.5e-101 



350.0 



Immunoglobulin domain 



9.5e-06 



23.6



Zinc finger, C2H2 type 



1.7e-42 



154.6 



Immunoglobulin domain

Mitochondrial carrier proteins 



2.8e-15



54.3



3.5e-67 



233.2 



TPR Domain 



6.1e-17 



69.7



SH3 domain 

ATPases associated with various cellular act



3.5e-09



4.1e-21



Spectrin repeat 



2.1e-67 



zf-C2H2 



fn3 



WD40 



E1_dehydrog
fn3



LRR 



cadherin 



zf-CXXC 



RhoGEF 



F-box 



SNF2_N



CPSase_L_chain



LRR 



DENN 



RasGEF 



ank 



G-patch 



pkinase 



Plexin_repeat



Plexin_repeat



Zinc finger, C2H2 type 



0.0066 



Fibronectin type III domain 



4.1e-102 



WD domain, G-beta repeat 
Dehydrogenase E1 component



0.00049 



3e-119 



Fibronectin type III domain 



Leucine Rich Repeat 



2.1e-10 



Cadherin domain 



8.1e-81 



CXXC zinc finger 



5e-15 



RhoGEF domain 
F-box domain. 



1.1e-23



4.2e-06 



SNF2 and others N-terminal domain



5.8e-16 



Carbamoyl-phosphate synthase (CPSase)



1.5e-172 



Leucine Rich Repeat
DENN (AEX-3) domain 



3.8e-24 



2e-58 



RasGEF domain 



Ank repeat 



8.1e-43 
1.4e-153



G-patch domain 



1e-19



Eukaryotic protein kinase 
domain 



2.2e-31 



Plexin repeat 



0.0023 



Plexin repeat 



0.0023 



43.9



83.6



237.3



23.1



352.6 



26.8



409.6 



1719.6 



48.0



281.9 



63.4 



92.1 



33.7 



61.6 



586.6 



93.6 



207.5 



155.7 
523.7 



78.9 



117.1 



24.6 



24.6 
SEQ ID NO: | PFAM NAME | DESCRIPTION | p-value | PFAM SCORE
429 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 8.6e-11 | 39.2
431 | DEAD | DEAD/DEAH box helicase | 1e-66 | 214.0
432 | SH3 | SH3 domain | 3.4e-16 | 67.2
433 | GTP_CDC | Cell division protein | 2.1e-114 | 393.5
436 | Collagen | Collagen triple helix repeat (20 copies) | 4.6e-194 | 658.1
438 | Ricin_B_lectin | Similarity to lectin domain of ricin b | 0.0085 | 10.5
441 | Alpha_adaptin_C | Alpha adaptin carboxyl-terminal domai | 1.2e-256 | 866.0
442 | Alpha_adaptin_C | Alpha adaptin carboxyl-terminal domai | 1.8e-235 | 795.7
443 | PDZ | PDZ domain (Also known as DHR or GLGF). | 1.9e-65 | 230.9
445 | LON | ATP-dependent protease La (LON) domain | 0.00012 | -17.1
446 | ig | Immunoglobulin domain | 0.00011 | 20.1
451 | sushi | Sushi domain (SCR repeat) | 1.4e-18 | 75.2
452 | fn3 | Fibronectin type III domain | 1.5e-06 | 35.2
454 | pyridoxal_deC | Pyridoxal-dependent decarboxylase conse | 8.3e-14 | 50.3
456 | kinesin | Kinesin motor domain | 4.9e-217 | 734.4
457 | neur_chan | Neurotransmitter-gated ion-channel | 1e-175 | 597.1
458 | Josephin | Josephin | 0.0002 | 18.7
468 | bZIP | bZIP transcription factor | 1.7e-07 | 31.8
470 | NTP_transferase | Nucleotidyl transferase | 6.3e-06 | -26.3
471 | WD40 | WD domain, G-beta repeat | 2e-28 | 107.9
473 | LIM | LIM domain containing proteins | 0.00021 | 20.7
477 | zf-RanBP | Zn-finger in Ran binding protein and others. | 0.028 | 21.0
479 | WD40 | WD domain, G-beta repeat | 6.5e-18 | 73.0
480 | KRAB | KRAB box | 1e-31 | 118.8
481 | ArfGap | Putative GTP-ase activating protein for Arf | 8.4e-66 | 232.0
485 | SH2 | Src homology domain 2 | 0.011 | 11.4
486 | C1q | C1q domain | 4.3e-74 | 259.6
487 | dsrm | Double-stranded RNA binding motif | 1.1e-47 | 171.9
489 | zf-C2H2 | Zinc finger, C2H2 type | 4.8e-153 | 521.9
490 | Alpha_adaptin_C | Alpha adaptin carboxyl-terminal domai | 3.4e-222 | 751.6
492 | SKI | Shikimate kinase | 1.2e-10 | 48.8
497 | ENV_polyprotein | ENV polyprotein (coat polyprotein) | 2.6e-22 | 77.6
498 | abhydrolase_2 | Phospholipase/Carboxylesterase | 0.041 | -48.1
500 | rrm | RNA recognition motif. | 5.4e-34 | 126.4
501 | WW | WW domain | 4.6e-18 | 73.4
502 | ig | Immunoglobulin domain | 1.1e-10 | 39.5
504 | abhydrolase | alpha/beta hydrolase fold | 0.045 | -3.6
505 | vwa | von Willebrand factor type A domain | 7.1e-62 | 219.0
508 | Na_K_ATPase_C | Na+/K+ ATPase C-terminus | 2.3e-145 | 496.3
509 | Exonuclease | Exonuclease | 1.3e-56 | 201.5
510 | Glycos_transf_1 | Glycosyl transferases group 1 | 2.9e-06 | 27.0
511 | Glycos_transf_1 | Glycosyl transferases group 1 | 2.9e-06 | 27.0
512 | Glycos_transf_1 | Glycosyl transferases group 1 | 1.9e-09 | 38.5
514 | pro_isomerase | Cyclophilin type peptidyl-prolyl cis-tr | 1.8e-63 | 221.4
SEQ ID NO:



515 



516 



523 



526 



528 



530 



531 



532 



533 



534 
535 



536 



537



538 



PFAM NAME



EGF 



Surp 



3-9 



UBX 



adh_zinc



SAM 



adh_short 
mito_carr



mito_carr



thiolase 
FMO-like 



SCAN 



tRNA-synt_1



tRNA-synt_1



DESCRIPTION 



p-value 



PFAM 
SCORE 



EGF-like domain
Surp module 



1.9e-18 



74.7 



4.3e-38 



140.0



Immunoglobulin domain
UBX domain 



3.3e-06 



25.0 



1.1e-34



Zinc-binding dehydrogenases 



2.7e-34 



SAM domain (Sterile alpha motif)



0.046 



short chain dehydrogenase 



0.0025 



Mitochondrial carrier proteins 
Thiolase



2e-61 



3.5e-183 



Flavin-binding monooxygenase-like



SCAN domain 



tRNA synthetases class I (I, L, 
M and V) 



4e-55 
3.1e-136 



tRNA synthetases class I (I, L, 
M and V) 



3.1e-136



128.6 



127.4 



10.0 



-34.1 



Mitochondrial carrier proteins | 2.5e-81 | 281.7



213.5



622.0 



1153.7 



196.6 



466.0 



466.0 



539 



540 



tRNA-synt_1



tRNA synthetases class I (I, L, 
M and V) 



1.9e-117 



tRNA-synt_1



tRNA synthetases class I (I, L, 
M and V) 



3.1e-136



403.6 



466.0 



541 



543 



544 



545 



547 



548 



549 



551 



vATP-synt_E



zf-C2H2 



ATP synthase (E/31 kDa) subunit 
Zinc finger, C2H2 type 



5.9e-85 
5.5e-69 



DUF101 



Protein of unknown function 
DUF101 



8.5e-38 



TGFb_propeptide



TGF-beta propeptide 



1.1e-67



WD40 



WD domain, G-beta repeat 



2.6e-32 



RHD 



MMR_HSR1



Rel homology domain (RHD).
GTPase of unknown function 



1.6e-238 



5.4e-67 



HECT 



HECT-domain (ubiquitin-transferase).



4.3e-127 



295.7 



242.6 



139.0 



238.2 



120.8 



686.2 



236.0 



435.6 



554 



555 



MHC_II_alpha 



Class II histocompatibility 
antigen, alp 



3.5e-74 



zf-UBR1



Putative zinc finger in N-recognin



3.3e-16



259.8 



67.3 



556 



561 



562 



564 



566 



567 



569 



570 



571 



572 



573 



575 



576 



577 



578 



579 



580 



583 



Kelch 



Kelch motif 



5.5e-29 



AMP-binding 



PABP 



AMP-binding enzyme 

Poly-adenylate binding protein, unique domai



2.8e-06



4.9e-38 



Gag_p30



Gag P30 core shell protein 



PWWP 



PWWP domain 



1.2e-67 



8.1e-16



SCAN 



pkinase 



SCAN domain 

Eukaryotic protein kinase 
domain 



pkinase 



Eukaryotic protein kinase
domain 



CN_hydrolase



myosin_head



myosin_head



Surp 



Surp 



DNA_pol_B 



PDZ 



LRR 

neur_chan



sushi 



Carbon-nitrogen hydrolase 
Myosin head (motor domain) 



Myosin head (motor domain) 
Surp module 



Surp module 



DNA polymerase family B



PDZ domain (Also known as DHR or GLGF).



Leucine Rich Repeat 



Neurotransmitter-gated ion- 
channel 



Sushi domain (SCR repeat) 



7.3e-68 



1.5e-84 



1.5e-84 



0.00081 



1.7e-23 



1.7e-23



8.3e-09 



4.9e-21 



109.7 
-163.7 



5.9e-177 



139.8 



238.2 



66.0 



238.9 



294.3 



294.3 



-79.7 
1495.2 



1490.4 



91.5 



91.5 



1138.6 



42.7 



83.3 
601.3 



1673.0 
116.3 



584 



586 



DEAD 



DEAD/DEAH box helicase 



7.3e-36 



KH-domain 



KH domain 



2.9e-13 



57.5 
61.2 



587 



589 



590 



591 



G-patch 



G-patch domain 



2.3e-14 



LIM 

bromodomain



LIM domain containing proteins 



2.3e-36 



Bromodomain 



6.6e-32 



133.4 
114.7 



bromodomain 



Bromodomain 



6.6e-32 



114.7 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: | PFAM NAME | DESCRIPTION | p-value | PFAM SCORE


592 




TiiaanH-hinrii ncf Hnrnsin of* 
nucl&ar hoirnione 


3 5p»-20 
-> * jc xx 


ft*7 1 

O / ■ X 


593 


PHD 


PHD-finger


3.8e-12




594 


cadherin 


Cadherin domain 


4.2e-99 


342.7


596 


t>1c i na 


domain 




jX? 


597 


WD40 


WO domain fl-bpha Y*£*ri«at- 


o nnn54 


x o • / 


600 

W w w 


FG-GAP 


PG-GAP rpnpat - 




XOx . 2/ 


602 

W V *M 


G AdsiTit- CT 


Gan una -aflanh i ti f — 1~ p T*m ^ nnc 
uaiiuuci aua^; u in j v— LCJLiiiniUa 


X . XC JJ 


1 Q1 R 


603 

\J \f *J 


nk i na 


Rnka Y~\m t" i 0 ni*"Ot* pin V i na ca 
domain 


x . jtr o o 


inn d 


605 


Collagen 


Collagen triple helix repeat (20 copies)


8e-42


152.4 


606 

w V V 




M i 1" nrhnndri a 1 rarTipr nrrifpi tiq 




X J X.J 1 


608 

w v v 


PWWP 
* nils 


PWWP domain 


X • OC X o 


XW t m mJ 


609 


PWWP 


PWWP domain 


2.6e-28


107.5 


613 

W bX. *J 


CAP G1.Y 


CAP-Glv domain 




*>n i 

xU • X 


615 

U J. J 


RPX HNA hind 

ing 


DNa-h-i -nd-i nci domain 






616 


If H npai n 


i\J.llCOXll IUULUJ. ULJUlCtXll 


1 lo.RI 


0 Q A ft 
X 04 • C 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5 


61 R 

O X o 




ixnt unyer/ L»jriL.f4 type \K±ii\a 

■f i ncf^T*^ 
x iiiycx / 




XJ . X 


620 


MATH 


MATH domain 


7.8e-05 


22.2 


D X X 


I LJJ lt-JiaJjJ Ilel |_ do 

A 

"S 


Jrt Otcin UyiOS J,uc pxiwopnaCaSc 


x . *e** 


lit £ 

xZ X . b 


622 


pkinase 


Eukaryotic protein kinase domain


4.4e-40


146.6 


623 






x . iB'li 


31 . O 


624 

V fat ^2 


mo 1 vhH on 1* r* i 

XL 


PTftlfawrtt 1 i f mol vh/intihorin 
rLUjs.aiyuLxu mux yuuuu Lei jlh 

ovi dovp Hit r* a q 


x ■ *±e — xx 


fi X . X 


625 


TPR 


TPR Domain 


1.1e-17


72.2


627 


rMMP V-\ -j nd i nrr 
Linrjr u x i lvA j. l ly 


C^rt* 1 nii a H •! l«i { n/1i n<*r 

cyuiic nucieoL xue-Dinuiuy 
d nm a i n 




*> n c c 
xU o . o 


630 


adh_short


short chain dehydrogenase 


5e-17 


70.0 


631 


x w*n* 


7.inr* f inrror PTUO h\rr>o 
ZiX^iv^ luimcl / v» ^ n ^ l. y u c 


X t xe O o 


1 ft "7 T 
JU / .1 


632 

O *J *o 


X X III 


ivLM/^ XVcCvJy IIX L.XQJX1 UlOl.lI . 


^te** us 


*a ft c: 

JU . J 


635 

W J J 


JJl\.XilCL 9w 


domain 


x . oe-xuft 


TI C ft *7 


636 

O JO 




H*ovlf VioaH rlr^nna H n 




IUj . U 


637 


pkinase 


Eukaryotic protein kinase domain


3.8e-70


246.5 


642 


TPR 


TPP rioma i n 


*t . oc u o 


Aft i 

%U . X 


643 


c* "f Viand 






1 ftd c 
Xu4 . O 


647 


SNF2 M 


and ot*h.#*Y*» M-fprminal 

domain 


X . X tr X V/ JL 


mi i 


648 


PfiGlldrtU RVTife 

h_2 


DMA "rvcipn H 01 1 T*"i dvrl ahp avnhhaap 
*um yQcuuuuLiuyiaic ayuuucioc 


1 q P .c;c; 


17 / <D 


650 | zf-C2H2 | Zinc finger, C2H2 type | 0.0087 | 22.7
651 | ank | Ank repeat | 1.3e-17 | 71.9
652 | I_LWEQ | I/LWEQ domain | 9.5e-101 | 341.0


653 


neur chan 


NGUif otiranami tt^if -a p at p ion-* 
channel 


~ » X C X / X 


581 8 i 
o x . a 


654 | tsp_1 | Thrombospondin type 1 domain | 4.1e-47 | 169.9
659 | FH2 | Formin Homology 2 Domain | 1e-107 | 371.2
661 | pou | Pou domain - N-terminal to homeobox domain | 5.3e-45 | 162.9
662 | C2 | C2 domain | 6.7e-19 | 76.2
663 | C2 | C2 domain | 6.7e-19 | 76.2
664 | C2 | C2 domain | 6.7e-19 | 76.2
667 | GST | Glutathione S-transferases. | 9.3e-34 | 114.4
668 | LRR | Leucine Rich Repeat | 9.3e-31 | 115.6
670 | spectrin | Spectrin repeat | 4e-57 | 203.2
671 | I_LWEQ | I/LWEQ domain | 9.5e-101 | 341.0
672 | ABC_tran | ABC transporter | 5.3e-60 | 212.8
674 | WD40 | WD domain, G-beta repeat | 4.8e-24 | 93.3
SEQ ID NO: | PFAM NAME | DESCRIPTION | p-value | PFAM SCORE
675 | WD40 | WD domain, G-beta repeat | 4.8e-24 | 93.3


676 


LRR j 


Leucine Rich Repeat | 


\J - U VJ J. J 


95 2 


679 | zf-CCCH | Zinc finger C-x8-C-x5-C-x3-H type | 2.6e-29 | 107.7


680 


ZE-C2H2 j 


zinc ringer, t/n^ uype j 


5 2P-05 


30 1 

aJ W a -*» 


681 | CH | Calponin homology (CH) domain | 2.4e-17 | 71.1


682 


DSPc | 


Dual specificity pnospnatase, 
catalytic doma | 




i -> o » o 


683 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 0.051 | 10.8


687 1 


Synapsin | 


Synapsin J 


n 1 
! 


1890 8 


689 


PR55 r 


Protein pnospnatase 2 a 
regulatory suounic rK | 


n 

I 


1038 8 


691 


homeobox } 


Homeobox domain j 


O . -J w «J V 1 


112 4 

X X t» ■ ^ 


696 


Peptidase_M2 
4 j 


metallopeptiaase ramny s r i^ t t 




210 5 


697 


RhoGEF | 


xuioc^h. r QOina, in \ 


af • a># ^ aV | 


128 . 9 


698 


PHD 


jfriL/- 1 mger j 


0 008 1 


9 .3 


701 


2I-C2H2 j 


^lnc linger, u^ri^ cype s 


5 5e-123 

• ■>/ ala >y aV | 


422 .0 


702 


Sulfatase j 


smtatase 




781.6 

t aa>> «* } 


703 | zf-C2H2 | Zinc finger, C2H2 type | 5.7e-20 | 79.8


707 


Acyl trans f ! 


Acyl trans t erase aomam 


1 1 p-Op. f 


88 8 

WW* w 


708 


WD40 j 


WD domain, o-oeta repeat i 


4 fte-19 1 


76 . 7 


710 


Ran_BPl | 


RanBPl aomam. 


O ■ tC Y/u 


-7 3 

9 * a># 


713 


DEAD | 


DhJUj/DoAri oox nsiicase 


Q 9e-42 


134 . 9 

mk a-* * •> — ' 


714 | PH | PH domain | 1.6e-09 | 39.0


715 


DSPc 


Dual specificity pnospnatase, 
catalytic doma 


X . O / 


138 2 


717 


Sialyltransf 


Sialyltransterase tamiiy 


*7 Cp_T1 


115 9 

X X —> . -/ 


718 




Immunoglobulin domain 




10 0 8 


719 


integrin_B 


Integrins, beta chain 


1 n 


1125 4 


720 


zf -C3HC4 


Zinc finger, C3HC4 type \kincj 
finger) 


X . 1c UO 


^9 4 


722 


Peptidase_C2 


Calpain family cysteine 
protease 


"Jp M 1 AC 
! J " S. L ±Z> 


495 9 


723 


ig 


Immunoglobulin domain 


1 9 Op-fit? 


92 4 

Am. na 


724 


F-box 


| F-box domain. 


n 007 

1 V ■ W w t 


23 0 


725 | Nop | Putative snoRNA binding domain | 8.1e-58 | 205.5


726 


Nop 


| Putative snoRNA Oinding domain 




i one c 

I ^U 3 . J 


727 | WD40 | WD domain, G-beta repeat | 7.5e-26 | 99.3


730 


dsrm 


I Double- stranded RNA binding 
( motif 




1 1Z . X 


731 


dynamin 


| Dynamin family 


f 4 O fi _ 1 c 
4 . Zc xo 




733 


zf-CCCH 


Zinc finger c-x8-c-x5-l.-x.j-h 
type 


1 o Ro-1 n 

1 / . oc 1U 


41 7 


735 


CDP- 

OH P trans f 


CDP-alConol 
| phosphatidyl transferase 


4 2e-26 

I • 


100 . 1 

1 a#* \*f W • a** 


738 


DEAD 


1 t>tti "7v TN /T^CTiTT V»»»w Vial "1 C£> 

I DbiAD/Dc^vtl OOX neilCdSc 


1 8 6e-57 


182 . 5 


«aa a 

739 


TSC22 


| loC- iJLf dip/ nun lamiiy 


6 5e-32 


I 119 . 5 


742 


ras 


1 Kas ramiiy 


? 2 2e-100 

1 4W • at* V*» «W **» 


346.9 


743 


PMI typel 


i r/nospnomannosc launiciaac ^Ye^ A 


^ 1 . 2e-243 


1 822 . 9 


747 | trypsin | Trypsin | 6.4e-88 | 279.4


748 


Jcazal 


Kazal-type serine protease 
J inhibitor domain 




j 187.4 


749 


efhand 


j EF hand 


1 OiJC UO 


33 1 

aJ *>J ■ aW 


751 


PHD 


| PHD- ringer 


A Qp-1 

I *± . -7 C lO 


66 7 


752 


zf -C2H2 


| Zinc ringer, C2n2 type 




l 83 9 

I UI _J « a-* 


753 


Hydrolase 


haloacid denalogenase-iixe 
1 hydrolase 


b . J.c- IX 


1 49 a 


7R4 


9 


1 Ribosomal L3 9 orotein 


0.00018 


26.7 


755 | PH | PH domain | 3.6e-14 | 55.7
758 | SCAN | SCAN domain | 1.4e-53 | 191.5
759 | PA | PA domain | 0.0065 | 23.1
760 | arf | ADP-ribosylation factor family | 2.2e-19 | 77.8
761 | CIDE-N | CIDE-N domain | 2.2e-40 | 147.6






SEQ ID NO: | PFAM NAME | DESCRIPTION | p-value | PFAM SCORE
762 | histone | Core histone H2A/H2B/H3/H4 | 9.9e-53 | 188.6
763 | zf-MYND | MYND finger | 4.1e-14 | 60.3
764 | pou | Pou domain - N-terminal to homeobox domain | 1e-52 | 188.6
767 | vwc | von Willebrand factor type C domain | 2.9e-34 | 127.3
769 | efhand | EF hand | 4.8e-11 | 50.1
770 | zf-C4 | Zinc finger, C4 type (two domains) | 2.4e-53 | 181.6
772 | ras | Ras family | 7e-90 | 312.0
773 | Sulfatase | Sulfatase | 1e-142 | 487.5
775 | zf-C2H2 | Zinc finger, C2H2 type | 1.1e-12 | 55.5
776 | zf-C2H2 | Zinc finger, C2H2 type | 1.1e-12 | 55.5
777 | zf-C2H2 | Zinc finger, C2H2 type | 1.1e-12 | 55.5
778 | rrm | RNA recognition motif. | 2.1e-32 | 121.1
779 | G6PD | Glucose-6-phosphate dehydrogenase | 1.5e-76 | 236.6
780 | spectrin | Spectrin repeat | 3.7e-29 | 110.3
781 | mito_carr | Mitochondrial carrier proteins | 4.6e-57 | 198.5
782 | SCAN | SCAN domain | 1.3e-24 | 95.2
783 | PDZ | PDZ domain (Also known as DHR or GLGF). | 4.1e-07 | 37.1
785 | DEAD | DEAD/DEAH box helicase | 6e-06 | 21.7
786 | ras | Ras family | 5.3e-39 | 143.0
787 | RNase_HII | Ribonuclease HII | 2.5e-67 | 237.1
790 | PI3_PI4_kinase | Phosphatidylinositol 3- and 4-kinases | 5.4e-108 | 372.2
795 | cadherin | Cadherin domain | 2.5e-40 | 147.4
796 | ARID | ARID DNA binding domain | 1.6e-20 | 81.6
797 | trypsin | Trypsin | 9.9e-20 | 64.8
799 | CH | Calponin homology (CH) domain | 3.7e-15 | 63.8
801 | Gal-bind_lectin | Vertebrate galactoside-binding lectin | 4.1e-25 | 88.7
803 | WD40 | WD domain, G-beta repeat | 0.00082 | 26.1
806 | TBC | TBC domain | 1.8e-26 | 101.4
807 | TBC | TBC domain | 1.8e-26 | 101.4
808 | CN_hydrolase | Carbon-nitrogen hydrolase | 8.8e-80 | 278.5
811 | CBFD_NFYB_HMF | Histone-like transcription factor | 6e-14 | 59.8
812 | adh_short | short chain dehydrogenase | 8.1e-20 | 79.3
814 | IMP4 | Domain of unknown function | 3.3e-71 | 250.0
815 | zf-C2H2 | Zinc finger, C2H2 type | 8.2e-66 | 232.1
816 | Pept_tRNA_hydro | Peptidyl-tRNA hydrolase | 1.6e-37 | 138.0
817 | ARID | ARID DNA binding domain | 2.5e-18 | 74.3
826 | IF5_eIF4_eIF2 | eIF4-gamma/eIF5/eIF2-epsilon | 1.6e-32 | 121.5
830 | ArfGap | Putative GTP-ase activating protein for Arf | 1.5e-53 | 191.3
831 | LRR | Leucine Rich Repeat | 2.1e-26 | 101.1
832 | laminin_EGF | Laminin EGF-like (Domains III and V) | 2e-57 | 204.2
839 | rrm | RNA recognition motif. | 1.3e-22 | 88.5
840 | Y_phosphatase | Protein-tyrosine phosphatase | 2.6e-119 | 409.8
841 | pkinase | Eukaryotic protein kinase domain | 3.4e-100 | 346.3
844 | Ribosomal_L22e | Ribosomal L22e protein family | 1e-64 | 228.4
846 | IBR | IBR domain | 9e-15 | 62.5
849 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 7.4e-07 | 26.5
850 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 0.00016 | 18.9
851 | SET | SET domain | 5e-30 | 113.2
852 | SRCR | Scavenger receptor cysteine- | 0 | 1025.4
SEQ ID NO:



858 



859 



861 



863
864



866 



867 



872 



874 



879 



882 



887 



889 



893 



898 



899 



900 



904 



910 
"9lT 



923 



924 



925 



926 



928 



929 



930 



931 



936 



937 



940
944
948



949 



950 



PFAM NAME 



DESCRIPTION 



p-value 



SRCR 



lactamase_B



COX6A 



rrm 



rich domain 

Scavenger receptor cysteine- 
rich domain 

Metallo-beta-lactamase superfamily
Cytochrome c oxidase subunit VIa
RNA recognition motif.



0.012 



3.4e-58



5.4e-45 



PRK 



mito_carr
HSP90



Phosphoribulokinase

Mitochondrial carrier proteins 



5.1e-62



2.9e-53 



Hsp90 protein 



4.7e-158



Immunoglobulin domain



4e-12 



zf-C2H2 



Zinc finger, C2H2 type 
Core histone H2A/H2B/H3/H4 



7e-135 



histone

CPSase_L_chain | Carbamoyl-phosphate synthase (CPSase)



4.9e-41



2.1e-218



Ribosomal_S12e | Ribosomal protein S12e

serpin 



2.1e-98 



Serpins (serine protease 
inhibitors) 



2.5e-42 



Patatin 
RA 



Patatin 

Ras association (RalGDS/AF-6) 

domain 



1.2e-51 
0.044 



PFAM 
SCORE 



1025.4 



-6.0 



206.7 



162.9 



219.4 



185.5 



538.5 



44.1 



461.5 



149.8 



739.0 



340.3



145.7 



DUF92 



Integral membrane protein DUF92 | 2.7e-12 



sugar_tr 



Sugar (and other) transporter | 8.2e-63 



DUF28 



IP trans 



Domain of unknown function DUF28
Phosphatidylinositol transfer protein



1.3e-43 



6.5e-98 



DEAD 



DEAD/DEAH box helicase 



1.5e-48 



KE2 



KE2 family protein 



7e-61 



KE2 



KE2 family protein 



4.3e-51 



zf-C2H2 
ras 



Zinc finger, C2H2 type
Ras family



2.7e-57
2.3e-75 



TPR
GBP
GBP



TPR Domain
Guanylate-binding protein
Guanylate-binding protein



3.2e-22 
8.9e-253 
1.1e-239



WD40
PH



WD domain, G-beta repeat 
PH domain 



2.6e-26 
1.3e-09 



zf-C2H2 



Zinc finger, C2H2 type 



2.5e-39 



182 
8.0 



54.3 



222.1



158.3 



338.7 



156.5 



215.7



183.2



203.8
263.8



87.2
853.1
809.6 



100.8 
39.4 



Epimerase 



NAD dependent epimerase/dehydratase family



5,e-07 



TBC 
WD40 



TBC domain 

WD domain, G-beta repeat 



1.5e-09 
1.6e-25 



WD40 



WD domain, G-beta repeat 



8.2e-07 



Hydrolase 



haloacid dehalogenase-like hydrolase



2.9e-05 



UQ_con



CH



Ubiquitin-conjugating enzyme
Calponin homology (CH) domain



0.00033 



3.3e-53



WD40 

zf-C3HC4 



WD domain, G-beta repeat 



5.9e-48 



Zinc finger, C3HC4 type (RING finger)

Ribul_P_3_epim | Ribulose-phosphate 3 epimerase family



3.1e-10



7.2e-105



Ribul_P_3_epim | Ribulose-phosphate 3 epimerase family

C2 | C2 domain



1.2e-96 



NAP_family 



Nucleosome assembly protein 
(NAP) 



2.2e-62 
1.1e-22



abhydrolase | alpha/beta hydrolase fold 



0.011 



Tropomyosin | Tropomyosins



3.2e-07



pkinase 



Eukaryotic protein kinase 
domain 



3.4e-75



WD40 



WD domain, G-beta repeat 



1.8e-27 



Acyltransferase | Acyltransferase



1.6e-07



144.1 



-88.5 



30.7 
98TT 



36.1 



29.1 



-27.6 



190.2 



172.7 



37.4 



361.8 



334.4 



220.7 
84™g 



3.1 



25.1 



263.2



104.7 



38.4 
SEQ ID NO: | PFAM NAME | DESCRIPTION | p-value | PFAM SCORE
951 | SAM | SAM domain (Sterile alpha motif) | 0.014 | 14.5
954 | GFO_IDH_MocA | Oxidoreductase family | 1.3e-11 | 52.0
955 | BTB | BTB/POZ domain | 7e-22 | 86.1
956 | BTB | BTB/POZ domain | 7e-22 | 86.1
957 | CDP-OH_P_transf | CDP-alcohol phosphatidyltransferase | 0.053 | -22.2
959 | ras | Ras family | 2.4e-97 | 336.8
960 | ras | Ras family | 8.4e-43 | 155.6
961 | Acetyltransf | Acetyltransferase (GNAT) family | 1.2e-08 | 42.2
962 | adh_short | short chain dehydrogenase | 2.4e-31 | 117.6
963 | mutT | Bacterial mutT protein | 5.6e-06 | 26.2
969 | IF-2B | Initiation factor 2 subunit family | 8.4e-193 | 653.9
970 | RNase_PH | 3' exoribonuclease family | 9e-24 | 92.4
975 | WW | WW domain | 5.7e-25 | 96.4
977 | PDZ | PDZ domain (Also known as DHR or GLGF). | 3.6e-21 | 83.7
978 | Ribosomal_L17 | Ribosomal protein L17 | 2.4e-20 | 81.0
979 | LIM | LIM domain containing proteins | 5.8e-42 | 152.8
980 | Calsequestrin | Calsequestrin | 1.7e-297 | 1001.7
982 | HSP20 | Hsp20/alpha crystallin family | 1.2e-10 | 43.2
983 | oxidored_q6 | NADH ubiquinone oxidoreductase, 20 Kd sub | 4.8e-63 | 222.9
988 | TBC | TBC domain | 2.2e-50 | 180.8
989 | TBC | TBC domain | 2.2e-50 | 180.8
993 | tRNA_int_endo | tRNA intron endonuclease | 0.0017 | -34.2
994 | homeobox | Homeobox domain | 4e-18 | 73.6
997 | pyr_redox | Pyridine nucleotide-disulphide oxidoreducta | 0.012 | 11.6
1000 | mito_carr | Mitochondrial carrier proteins | 9.7e-123 | 421.2
1001 | RA | Ras association (RalGDS/AF-6) domain | 1.2e-15 | 65.4
1004 | DUF81 | Domain of unknown function DUF81 | 0.099 | 10.2
1005 | actin | Actin | 1.3e-174 | 574.3
1006 | actin | Actin | 3.1e-130 | 428.6
1007 | cpn60_TCP1 | TCP-1/cpn60 chaperonin family | 3.7e-195 | 661.8
1008 | TPR | TPR Domain | 8.1e-44 | 159.0
1009 | zf-C2H2 | Zinc finger, C2H2 type | 3.6e-61 | 216.6
1011 | zf-C2H2 | Zinc finger, C2H2 type | 3.6e-61 | 216.6
1012 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 4.7e-15 | 53.1
1016 | tRNA-synt_2c | tRNA synthetases class II (A) | 2.3e-15 | 55.2
1018 | RhoGAP | RhoGAP domain | 1.6e-78 | 274.3
1022 | PGAM | Phosphoglycerate mutase family | 3.8e-18 | 69.7
1026 | HMG_box | HMG (high mobility group) box | 8.4e-20 | 79.2
1027 | TBC | TBC domain | 7.3e-45 | 162.5
1028 | UQ_con | Ubiquitin-conjugating enzyme | 1.4e-49 | 178.1
1032 | PDZ | PDZ domain (Also known as DHR or GLGF). | 0.028 | 16.3
1034 | Hydrolase | haloacid dehalogenase-like hydrolase | 2e-21 | 84.6
1037 | KRAB | KRAB box | 4.8e-06 | 32.4
1038 | Cation_efflux | Cation efflux family | 7.1e-42 | 152.5
1040 | ART | NAD:arginine ADP-ribosyltransferase | 4.7e-47 | 169.1
1042 | WD40 | WD domain, G-beta repeat | 1.9e-18 | 74.7
1043 | zf-C2H2 | Zinc finger, C2H2 type | 3.7e-24 | 93.7
1045 | lectin_c | Lectin C-type domain | 1.9e-28 | 108.0
1046 | Glucosamine_iso | Glucosamine-6-phosphate isomerase | 0.00013 | -25.1
SEQ ID NO:



1047 



1049 



1050 



1054 



1055 



1058 



1059 



1060 



1062 



1064 



1065 



1066 



1071 



1072 



PFAM NAME 



ligase-CoA 



DESCRIPTION 



p-value 



CoA-ligases 



4.5e-80 



Immunoglobulin domain 



1.7e-09 



Ribosomal_L24e



Amidase 



rrm 



annexin
PMP22_Claudin



homeobox



Acyltransferase



AMP-binding



LRR



GTP1_OBG



ig 



PHD 



Ribosomal protein L24e 



2e-33 



Amidase 



4.3e-152 



RNA recognition motif. 



378e-26 



Annexin | 6.9e-44

PMP-22/EMP/MP20/Claudin family | 0.023



Homeobox domain 



3.2e-31 



Acyltransferase



0.00065 



AMP-binding enzyme



6.6e-100 



Leucine Rich Repeat 



3.3e-14



GTP1/OBG family



4.8e-41 



Immunoglobulin domain 



8.4e-48 



PHD-finger

DENN (AEX-3) domain 



6.8e-07 
8.3e-33 



1074 



1075 



1077 



1078 



1079 



1087 



DENN 



SCP 



SCP-like extracellular protein 



OLF 



Olfactomedin-like domain 



4.7e-41
2.2e-66



mito_carr



Mitochondrial carrier proteins | 1e-42



WD40 



WD domain, G-beta repeat 



6.2e-45 



START 



START domain 



1.5e-48 
3.3e-63



1093 



1094 



DSPc 



Dual specificity phosphatase, 
catalytic doma 



GSHPx 



Glutathione peroxidases 



9.6e-41 
2e-75 



1095 



DUF25 



Domain of unknown function 
DUF25 



1096 



DUF25 



Domain of unknown function 

DUF25 

Nitroreductase family 



6e-75 



1105 



1106 



1107 



1109 



1115 



1116 



1117 



1119 



1120 



Nitroreductase



1.3e-13



PTE 



Phosphotriesterase family 



1.3e-179 



DAGKc 



ras 



Diacylglycerol kinase catalytic domain

Ras family



0.00049 



1.3e-15 



ArfGap 



HMG14_17



HMG14_17



FAA_hydrolase



pkinase 



Putative GTP-ase activating 
protein for Arf 



9.7e-47 



HMG14 and HMG17 



4.4e-21



HMG14 and HMG17 

Fumarylacetoacetate (FAA) 
hydrolase fam 



9.9e-12 



2e-83 



Eukaryotic protein kinase 



domain 



1.4e-94 
9.2e-23 



1123 



abhydrolase 



alpha/beta hydrolase fold
Cyclophilin type peptidyl-prolyl cis-tr



1129 



1131 



1132 



1133 



pro_isomerase



2.2e-5S 



DnaJ 



DnaJ domain 



1.6e-30 



WD40 



WD domain, G-beta repeat 



1.3e-19 



WD40 



WD domain, G-beta repeat 



1.8e-15 



1134 



1136 



1137 



PH 



Adap_comp_sub



PH domain 

Adaptor complexes medium 
subunit family 



0.0015 



1.2e-256 



Adap_comp_sub



Adaptor complexes medium 
subunit family 



2.5e-209 



1139 



1141 



ras 



Ras family 



1.5e-86 



pkinase 



Eukaryotic protein kinase 
domain 



9.4e-74 



1152 



1153 



1155 



Acyltransferase



Acyltransferase



1.2e-05 



IRS 



PTB domain (IRS-1 type)



5.4e-55 



— s 

*g 



Immunoglobulin domain 



1.3e-31 



1157 



1159 



1160 



Asparaginase_2

GMC_oxred



Asparaginase 



6.4e-72 



GMC oxidoreductases 



4.7e-142



zf-AN1



AN1-like Zinc finger



0.00021 
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SBQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1163 


linkerjiisto 
ne 


linker his tone HI and H5 family 


3.8e-14 


60.4 


1164 


DED 


Death effector domain 


3.9e-05 


30.5 


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type) 


2 .6e-43 


157.3 


1168 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5 


1170 


abhydrolase 


alpha/beta hydrolase fold 


0.098 


-7.5 


1174 


SAP 


SAP domain 


3 .9e-10 


47.1 


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112 .5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33 .3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016 


24.7 


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56 


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito^carr 


Mitochondrial carrier proteins 


1 .5e-62 


217.3 


1187 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


0 r n_DAP_Arg_ 
deC 


Pyr idoxal - dependent 
decarboxylase 


6.2e-128 


430.6 


1193 


Stathrain 


Stathmin family 


l.Se-90 


314 .0 


1194 


Stathrain 


Stathmin family 


1.8e-90 


314 .0 


1195 


: Seel 


Seel family 


3.2e-183 


622.1 


1196 


pyr_redox 


Pyridine nucleotide -disulphide 
oxidoreducta 


3.1e-32 


111.8 


1197 


Glyco trans f 
_8 


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh short 


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_raethylt 
ran 


ubiE/C0Q5 methyltransf erase 
family 


1.3e-l21 


417.4 


1208 


7tm_3 


7 transmembrane receptor 


7.2e-09 


29.0 


1209 


ank 


Ank repeat 


3.9e-15 


63.7 


1210 


vATP- 
synt_AC3 9 


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand 


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


PX domain 


2.2e-15 


64.5 


1233 


PX 


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0 


1241 


Peptidase_M2
0 


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006 


6.3e-61


215.8 


1248 


Glycos_trans
f_2 


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand 


EF hand 


4e-11


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_trans
f 


Formyl transferase 


4.9e-30


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_re
d 


Dihydrofolate reductase 


2.1e-69 


241.7


1262 


G_glu_transp 
ept 


Gamma-glutamyltranspeptidase


1.8e-110


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9 
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SEQ ID 
NO: 



1266 



1267 



1269 



1275 



1276 



1277 



1279 
1280 

1285 



1287 



1294 



PFAM NAME 



SCP 



K_tetra



ras 



zf-C3HC4 



abhydrolase 



abhydrolase 



trypsin
PBP



zf-C3HC4 



ank 



K+ channel tetramerisation 
domain 



Ras family 



alpha/beta hydrolase fold 



Trypsin 
Phosphatidylethanolamine-binding protein



6e-29 



2.8e-27 



1.3e-85 



4.2e-10 



4.4e-41 
1.3e-13 



Zinc finger, C3HC4 type (RING 
finger) 



Ank repeat 



1.7e-52 



0.026



0.00026 
6.9e-41 



PFAM 
SCORE 



108.0



297.9 



37.0 



89.8 



83.1



132.0 
58.7 



187.8 



20.9 



1296 



1297 



1298 



1301 



1307 



1308 



1310 



PMP22_Claudi 
n 



Rhodanese 



Rhodanese-like domain 



3.2e-14 



LIM 



LIM domain containing proteins | 5.8e-21 



rnaseA 



mito_carr



Pancreatic ribonucleases | 4.9e-43
Mitochondrial carrier proteins | 2.1e-53



WD40 



WD domain, G-beta repeat 



1.6e-17 



u-PAR/Ly-6 domain 
Thioredoxin 



7.1e-20 



6077 



79.1 



186.0 



71.6 



75.5 



1314 



Aa_trans



Transmembrane amino acid 
transporter protein 



1316 



1320 



1327 



trypsin 



Trypsin 



Ribosomal_L1
3 



Ribosomal protein L13 



3.9e-62 



Armadillo_se

g 



0.0054 



219.8 



23.4 



1328 
1329 



1330 



1331 



1333 



1334 



1335 



1336 



1337 



KRAB 



rrm 



2.1e-40



Bcl-2 



PX 



KRAB 



UPP_syntheta 



se



UPP_syntheta 
se 



DSPc 



DSPc 



TPR 



Apoptosis regulator proteins 

Bcl-2 family

PX domain 



0.014 



2.1e-10



KRAB box 



Putative undecaprenyl 
diphosphate synt 



2.3e-89 



Putative undecaprenyl 
diphosphate synt 



1.8e-59 



Dual specificity phosphatase, 
catalytic doma 



1.2e-31 



TPR Domain



48.0 



310.3 



211.0 



Dual specificity phosphatase, | 2.3e-12 | 54.5
catalytic doma 



1341 



1343 



1344 



1345 



1347 



1348 



mutT 



Band_41



Kelch 



Antifreeze 



Antifreeze protein 



3Beta_HSD



3-beta hydroxysteroid
dehydrogenase/isomera



1.4e-44 



1.2e-10



0.086 



BTB 



5.3e-28 
0.033 



122.5 



161.5 



-177.2 



1349 



1350 



1352 



1353 



1355 



1356 



1357 



1360 



DUF6 



myosin_head



Nramp 



Natural resistance-associated 
macrophage pro 



1.2e-202 



S_100



S-100/ICaBP type calcium 
binding domain 



5.3e-23 



DEAD 



C2 



DEAD/DEAH box helicase 
C2 domain 



3.6e-65 



2.4e-15 



RBD 



4.2e-57 



zf-C2H2 



Zinc finger, C2H2 type 



7.4e-141



HMG14_17



HMG14 and HMG17 



7.9e-40 



1088.7 



686.6 



89.9 



209.0 



64.4 



203.1
481.4 
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NO: 


PFAM NAME


DESCRIPTION 


p-value


PFAM 
SCORE 


±J b Z 


Old 


SIS domain


3.8e-30


113.6


IJDO 


olo 


SIS domain


1.3e-28


108.5


1J 04 




—"a 'i a ^ afc v * A ^H^at 1 — « 4 ri| — 1-» — ^ ____ _____ ___ _J _____ 

Immunoglobulin domain


0.00026


19.0


IJDO 




K+ channel tetramerisation
domain


1.1e-16


68.9




v_.ijxxciyfcj.ri 


Collagen triple helix repeat
(20 copies)


O 111 

<s . ^e - 1 1 3 


390.1


137? 


X*l J.CHJ 


uildu uoiuaxii 


o . oc Jo 


in *j 
LSZ . / 


117K 




IvtvriO JJUX 


z . ie -JO 


141 . U 


1378 


ELM2 


ELM2 domain


2e-23 


91.3 


1 "3Qft 
JL<3 O VI 




Thioredoxin


1 . Z&- 1 J 


82.8


lJ ox 


ank


Ank repeat 


2.3e-83


290.4




olo 


nio/po^ domain 


Je-n 


50.8


i T m 

X.J DJ 




WD domain, G-beta repeat


1.6e-19


78.3


T 1 Q A 
1-5 04 


WD40


WD domain, G-beta repeat


o . 3e-24 


92.9


1 TOT 

lJ o / 




Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6 


1 "1 fi Q 


«y f _(*"5UO 
ZL -l«ZXlZ 


Zinc finger, C2H2 type


b . 5e-50 


1 TO CT 




Z x 


Zinc finger, C2H2 type


2.5e-85 


296.9 




kinesin


Kinesin motor domain


7.8e-188 


637.4 


X J 




-oinc linger/ u^xi4& cype 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


86.6 


x*£ U Z 




bZIP transcription factor


0.035 


13.1 




sugar_tr


Sugar (and other) transporter


0.003 


-101.5 


1406 


RhoGAP 


RhoGAP domain 


8.9e-47 | 168.8 


i /i n *7 


rrm


RNA recognition motif. 


le-3S 


132.1


i /i n □ • 


LRR


Leucine Rich Repeat 


2.1e-13


58.0


1409


Nebulin_repe


Nebulin repeat 


6e-54 


192.6 






Ank repeat 


1.6e-17


71.6




T"_l * /— X y*B. Mk A n T 

K 1 D OSOTTia 1 ^Ljo 


ribosomal L5P family C-terminus


8.2e-58


205.5


Idle: 


trypsin


Trypsin


4 la.QC 
* « / c a 9 


5 7H 4 


1 41 £ 

X*± X o 


cinixiiuux all x 


Aminotransferases class-I


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1 


1 fll Q 


WD40


WD domain, G-beta repeat


2.2e-09 


44.6 


1422 


cadherin 


Cadherin domain 


8.3e-42 


152.3 


1424


SH3 


SH3 domain 


2.5e-80


260.3 


1425


PHD 


PHD-finger


3.2e-17


70.6 


1426


PHD


PHD-finger


3.2e-17


70.6


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf


1e-37


138.8 




helicase_C


Helicases conserved C-terminal
domain


1e-26


102.2 


x^t 




WD domain, G-beta repeat


3.9e-07 


37.2 




inositol_P


Inositol monophosphatase family


2.5e-10 


40.2 


1 43T 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7 


143"? 

-L ^1 0 


Pin- 


tlC[ UUiuain 


2.9e-16 


66.2 


1434 


WD40


WD domain, G-beta repeat


1.6e-13 


58.3 


143^ 


Inos-1-
P_synth


Myo-inositol-1-phosphate
synthase


7e-228 


770.4 


1436 






1.4e-34


128.3 


1438 


1 1— 1 ■! ■ 

i cr 


x uuiiLiiifM x uxjux xzi Qumain 


1.3e-12


45.6 


1440 


G Adant CT 


ociutlllcl aUapulil/ L Lciuiinus 


3.4e-67


236.7 


1441 


G Adant - CT 

VJ ■**.> 4 CI W L. X-* _l 


fla iMTns s ^ t5 v"\ ^™ *i »i ^ ^ v in t mi a 
vjclullilct— auapu lu f V* ~ Lc^iulIlUS 


3.4e-67


236.7 


1443 


Kelch


X\fc:Xdl lUvJUxx, 


0.00013 


28.7 


1446 


ARID 


ARID DNA binding domain


1.8e-21 


84.7 


1447 




zixne Linger, type 


9.4e-28 


105.6 


1448 

JL TX " U 


ttl If UXilLiXi IvJ 


/u v i-r -Dinaiiiy enzyme 


2.6e-07 


-145.1 


1451 


rrm


RNA recognition motif.


6.5e-21


82.9


1454 




Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf


Sialyltransferase family 


5.4e-21 


83.2 


1460 


Aldose_epim 


Aldose 1-epimerase 


1.9e-35


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470 


TIG 


IPT/TIG domain 


3.1e-19 


77.3 


1472 


PseudoU_synt 


RNA pseudouridylate synthase 


4.3e-16 


66.9 
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1 SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION


p-value


PFAM
SCORE




h_2 








1474 


DENN 


DENN (AEX-3) domain


1.3e-44


161.6 


1475 


Cation_efflu
x


Cation efflux family


4.6e-49


176.4


1477 


TBC 


TBC domain


8e-47


169.0 


1478 


rrm 


RNA recognition motif.


2e-21


84.6


1480 


ig 


Immunoglobulin domain


5.5e-06 


24.3 


1484 


Telo_bind_al 
pha 


Telomere-binding protein alpha
subuni


0.028


-225.9


1485


zf-C2H2


Zinc finger, C2H2 type


1.8e-68


240.9


1486 


pkinase 


Eukaryotic protein kinase
domain


9.5e-13 


49.9 


1488 


helicase_C 


Helicases conserved C-terminal
domain


1.4e-15


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89


0.079


-132.4


1490 


ECH 


Enoyl-CoA hydratase/isomerase 
family 


5.2e-41 


149.7 


1491 


guanylate_cy 
c 


Adenylate and Guanylate cyclase 
catalyt


5.9e-46


166.1


1492 


LRR 


Leucine Rich Repeat


3.4e-19


77.2


1495


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.1e-10


36.3


1497 


pkinase 


Eukaryotic protein kinase
domain


1e-22


85.8 


1500 


SH3 


SH3 domain


9.3e-05 


27.2 


1502 


homeobox 


Homeobox domain


0.084


13.8


1503


homeobox


Homeobox domain


0.084


13.8


1505


EGF


EGF-like domain


2.7e-23


90.8


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508 


Peptidase_M2 
0 


Peptidase family M20/M25/M40 


2.8e-28


101.8 


1511 


PX 


PX domain 


1.9e-11


51.5 


1512 


Sulfatase


Sulfatase


2.8e-35


130.7


1516


Syntaxin


Syntaxin


0.011


-62.3


1518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


9.7e-106


305.6 


1520 


ig 


Immunoglobulin domain 


0.075


11.0


1521 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.013


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05


18.7


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24


93.1


1535 


IMS 


impB/mucB/samB family 


7.8e-95


328.5 


1538


FYVE


FYVE zinc finger


3.2e-27


101.5


1539


DAGKC 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


1540 


Ocular_alb


Ocular albinism type 1 protein


0


1184.7


1653 


SAP 


SAP domain 


6e-06 


33.2


1654 


Amino_oxidas
e


Flavin containing amine oxidase 


3.2e-43 


157.0 


1655 


Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0


1656 


RhoGEF 


RhoGEF domain 


1.4e-24


95.1


1657 


MMR_HSR1 


GTPase of unknown function 


0.0011


-45.5


1659 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1


1660 


actin 


Actin 


6.6e-21


69.9 


1661 


BAH 


BAH domain 


1.7e-82


287.5


1662 


vwa 


von Willebrand factor type A
domain 


i n 
1 


1909 4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93


324.4


1669 


Nol1_Nop2_Su
n


NOL1/NOP2/sun family


1.3e-23


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-15


46.9 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1672 


chromo 


'chromo' (CHRromatin
Organization Modifier) 


2.1e-18 


67.7 


1674 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


0.0025 


17.6 


1676 



Glyco_hydro_ 
47 


Glycosyl hydrolase family 47 


1.8e-187 


636.2


1677 


Glyco_hydro_
47 


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1683 


MMR_HSR1 


GTPase of unknown function


1.8e-78 


274.1 


1691 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 


1693 


AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5 


1697 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


8.4e-82 


285.2 


1698 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


3.5e-53


190.1 


1699 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-34 


126.6 


1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP_EFTU


Elongation factor Tu family 


0.014 


11.4 


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4 


1707 


pkinase 


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9 


1709 


WD40 


WD domain, G-beta repeat 


0.0035 


24.0 


1710 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1715 


ras 


Ras family 


4.4e-41


149.9 


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21


82.6


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH 


Helix-loop-helix DNA-binding
domain 


9.2e-10 


45.9 


1723 


dsrm 


Double-stranded RNA binding
motif


2.9e-05


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine 
dimethylases


0.045 


9.2 


1725 


CIDE-N 


CIDE-N domain 


5.9e-40 


146.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160.5 


1728 


efhand 


EF hand 


5.1e-20


79.9 


1733 


Hist_deacety 
1 


Histone deacetylase family 


1.7e-104 


360.6 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34


126.6


1739 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase


0.0023 


16.1 


1743 


ras 


Ras family 


3.7e-10 


-21.3 


1744 


ras 


Ras family 


3.7e-10


-21.3 


1745 


RasGEF 


RasGEF domain 


3 .2e-49 


176.9


1746 


adh_short 


short chain dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2


Zinc finger, C2H2 type


9e-39


142.2 


1754 


fn3 


Fibronectin type III domain 


5.5e-101 


348.9 


1756 


zf-C2H2


Zinc finger, C2H2 type 


6.3e-93 


322.1 


1758 


rrm 


RNA recognition motif. 


0.017 


21.2


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


J> / OS 




GTPase of unknown function


6.4e-41


149.4


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF 


RhoGEF domain 


1.6e-23


91.6 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION


p-value


PFAM 
SCORE 


1785 


rrm 


RNA recognition motif. 


6.4e-14 


59.7 



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TABLE 5 



SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM
SCORE) 


MeanS (MEAN 
SCORE) 


1 


1-21 


0.991 


0.955


2 


1-31 


0.995 


0.944 


3 


1-33 


0.949 


0.736 


4 


1-19 


0.970 


0.951 


5 


1-26 


0.971 


0.863 


6 


1-26 


0.971 


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971 


0.863 


9


1-46 


0.982 


0.901 


10 


1-21 


0.991 


0.955 


11 


1-23 


0.989 


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.939 


17 


1-27 


0.964 


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935 


0.701 


21 


1-22 


0.974 


0.850 


22 


1-33 


0.961 


0.895 


23 


1-19 


0.991 


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26 


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736


46 


1-19 


0.970 


0.951 


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30 


0.991 


0.919 


75 


1-29 


0.958 


0.854 


88


1-20 


0.986 


0.945 


94 


1-33 


0.994 


0.943 


97 


1-46 


0.964 


0.595 


103 


1-49 


0.983 


0.570 


108 


1-26 


0.978 


0.885 


111 


1-23 


0.989 


0.899 


126 


1-25 


0.955 


0.803 


129 


1-19 


0.963 


0.918 


138 


1-29 


0.971 


0.844 


143 


1-18 


0.914 


0.628 


148 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811 


158 


1-22 


0.979 


0.927 


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825 


180 


1-27 


0.981 


0.941 


187 


1-28 


0.982 


0.936 


190 


1-19 


0.953


0.840 


196 


1-22 


0.975


0.916 


197 


1-22 


0.963 


0.936 
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SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


199 


1-20 


0.935


0.701


200 


1-23 


0.977


0.773


206


1-30


0.984


0.890


207


1-19


0.990


0.924


208


1-22


0.974


0.850


210


1-40


0.940


0.670


211


1-28


0.971


0.849


216


1-24


0.986


0.956


218


1-33


0.961


0.895


219


1-19


0.970


0.871


221


1-19


0.904


0.553


222


1-21


0.917


0.555


230


1-19


0.991


0.959


231


1-26


0.953


0.800


232


1-25


0.988


0.826


239


1-23


0.969


0.828


240


1-17


0.982


0.955


241


1-17


0.982


0.955


245


1-30


0.970


0.722


248


1-22


0.976


0.935


249


1-23


0.968


0.940


252


1-18


0.971


0.923


261


1-24


0.883


0.587


265


1-18


0.939


0.868


272


1-24


0.953


0.739


283


1-21


0.906


0.688


284


1-29


0.997


0.854


290


1-31


0.986


0.841


302


1-28


0.980


0.893


304


1-16


0.907


0.635


312


1-19


0.993


0.976


313


1-17


0.930


0.753


323


1-22


0.998


0.909


324


1-17


0.982


0.954


328


1-19


0.971


0.865


329


1-22


0.963


0.924


330


1-33


0.978


0.841


331


1-24


0.920


0.712


332


1-24


0.975


0.881


333


1-19


0.984


0.941


334


1-20


0.899


0.567


335


1-27


0.942


0.813


336


1-20


0.952


0.850


337


1-38


0.942


0.653


338


1-27


0.973


0.772


339


1-36


0.979


0.804


340


1-27


0.886


0.597


343


1-19


0.971


0.865


344


1-22


0.994


0.928


345


1-17


0.966


0.687


346


1-19


0.936


0.822


347


1-22


0.963


0.924


349


1-24


0.982


0.966


351


1-21


0.918


0.815


352


1-31


0.988


0.912


354


1-31


0.974


0.839


355


1-29


0.932


0.632


356


1-15


0.994


0.969






0.935 


0.726


360


1-27


0.938


0.827


361


1-25


0.954


0.674


362


1-22


0.929


0.788


363


1-21


0.881


0.715


364


1-33


0.978


0.841


365


1-33


0.978


0.841
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SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


366 


1-21 


0.916 


0.820 


367 


1-19 


0.936 


0.822 


368 


1-29 


0.972 


0.874 


370 


1-24 


0.920 


0.712 


371 


1-24 


0.961 


0.773 


372 


1-27 


0.919 


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994 


0.932 


376 


1-34 


0.987 


0.810 


377 


1-17 


0.995 


0.950 


378 


1-49 


0.971 


0.749 


380 


1-20 


0.968 


0.874 


381 


1-20 


0.928 


0.782


382 


1-19 


0.986 


0.934 


383 


1-28 


0.965 


0.829 


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881


388 


1-30 


0.989 


0.868 


389 


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981 


0.900 


393 


1-16 


0.968 


0.890 


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985 


0.854 


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994 


0.921 


407 


1-35 


0.987 


0.658 


408 


1-39 


0.976 


0.551 


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0.962 


411 


1-38 


0.977 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988 


0.965 


414 


1-46 


0.993 


0.638 


415 


1-23 


0.981 


0.940 


417 


1-29 


0.941 


0.672 


418


1-20 


0.952 


0.850 


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889 


0.785 


422 


1-48 


0.982 


0.862 


424 


1-19 


0.979 


0.933 


428 


1-38 


0.942 


0.653 


430 


1-18 


0.947 


0.595 


432 


1-33 


0.957 


0.789 


433 


1-26 


0.979 


0.904 


434 


1-27 


0.962 


0.777 


435 


1-24 


0.998 


0.977 


436 


1-27 


0.973 


0.772 


443 


1-15 


0.966 


0.940 


448 


1-36 


0.979 


0.804 


453 


1-41 


0.958 


0.609 


455 


1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23 


0.930 


0.593 
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SEQ ID NO: 


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


511


1-23


0.930


0.593


512


1-23


0.930


0.593


515 


l-lo 


0.978 


0.956


523


1-15


0.936


0.822


529


1-22


0.963


0.924


545


1-24


0.982


0.966 


550 


1-30


0.933


0.713


552


1-21


0.973


0.912


554


1-23


0.969


0.784


571


1-21


0.918


0.815


574


1-31


0.988


0.912


580


1-39


0.925


0.556


594


1-31


0.974


0.839


608


1-29


0.932


0.632


609


1-29


0.932


0.632


610


1-21


0.990


0.948


621


1-15


0.994


0.969


623


1-33


0.935


0.726


653


1-27


0.938


0.827


668


1-22


0.929


0.788


677


1-16


0.948


0.807


685


1-21


0.881


0.715


699


1-22


0.975


0.816


702


1-31


0.968


0.898


707


l-lo 


0.880


0.562 


713


1-25


0.966


0.743


718


1-19


0.936


0.822


719


1-20


0.961


0.824


729


1-29


0.972


0.874


735


1-46


0.903


0.598


746


1-14


0.916


0.730


747


1-22


0.965


0.876


748


1-29


0.968


0.785


759


1-24


0.961


0.773


767


1-27


0.919


0.768


768


1-33 


0.900 0.58b 


773


1-42


0.959


0.702


779


1-19


0.986


0.945


797


1-19


0.944


0.759


798


1-19


0.900


0.568


820


1-17


0.995


0.950


827


1-49


0.971


0.749


848


1-20


0.968


0.874


864


1-20


0.928


0.782


866


1-19


0.986


0.934


873


1-2J 


0.948 


0.886 


881


1-28


0.965


0.829


887


1-39


0.970


0.551


927




0.989 


0.868


934 


1 /I D 


0.988 


0.777 


939 




0.994 


0.889


944 


1-26


0.971 


0.782 


950


1 — z, y 


0.957 


0.845


963 


1-20


0.981 


0.900 


964 


i on 
1- Z\j 


0.886 


0.558 


973 


1-16 


0.968 


0.890 


980 


1-34 


0.961 


0.749 


981 


1-20 


0.953 


0.822 




1-12 


0.938 


0.780 


1015 


1-22 


0.985 


0.854 


1040 


1-46 


0.977 


0.698 


1052 


1-18 


0.969 


0.842 


1059


1-20 


0.927 


0.867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993


0.935
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SEQ ID NO 



1075 



1080 



1092 



POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 



1-27 



1-19 



1-19 



MaxS (MAXIMUM 
SCORE) 



0.992 



0.931 



0.991 



MeanS (MEAN
SCORE) 



0.934 



0.829 



0.973 



1094 



1095 
1105 



1-46 



1-30 



0.992 



0.974 



0.653 



0.929 



1123 



1138 



1140 



1142 



1152 



1-23 



1-35 



1-32 



1-38 



1-33 



1-25 



0.994 



0.987 



0.954 



0.989 



0.897 



0.990 



0.921 



0.658 



0.613 



0.789 



0.570 



0.962 



1170 



1176 



1187 



1189 



1-38 



1-20 



1-20 



1-35 



0.977 



0.944 



0.988 



0.967 



0.827 



0.768 



0.965 



0.839 



1192 



1193 



1-46 



1-16 



0.993 



0.925 



0.638 



0.710 



1197 



1208 



1-29 



1-23 



0.985 



0.981 



0.853 



0.940 



1225 



1245 



1258 



1265 



1266 



1276 



1292 



1296 



1297 



1332 



1358 



1371 



1380 



1-29 



1-19 



1-29 



1-22 



1-20 



1-48 



1-19 



1-21 



1-19 



1-38 



1-18 



1-33 



1-26 



0.941 



0.986 



0.965 



0.889 



0.944 



0.982 



0.979 



0.984 



0.984 



0.942 



0.947 



0.957 



0.979 



0.672 



0.967 



0.861 



0.785 



0.809 



0.862 



0.933 



0.944 



0.953 



0.653 



0.595 



0.789 



0.904 



1397 



1-27 



0.962 



0.777 



1399 



1-23 



0.937 



0.960 



1404 



1410 



1414 



1415 



1416 



1418 



1420 



1421 



1423 



1424 



1425 



1426 



1428 



1-24 



1-15 



1-24 



1-19 



1-12 



1-30 



1-20 



1-19 



1-17 



1-21 



1-24 



1-24 



1-25 



0.998 
0.946 



0.913 



0.982 



0.931 



0.933 



0.881 



0.990 



0.968 



0.885 



0.913 



0.913 



0.967 



0.977 



0.845 



0.588 



0.929 



0.891 



0.563 



0.561 



0.968 



0.863



0.591 



0.588 



0.588 



0.899 



1430 



1431 



1432 



1433 



1434 



1435 



1-34 



1-28 



1-36 



1-32 



1-39 



1-25 



0.977 



0.979 



0.957 



0.921 



0.983 



0.910 



0.819 



0.923 



0.613 



0.753 



0.621 



0.631 



1436 



1437 



1-42 



1-22 



0.988 



0.998 



0.868 



0.980 



1442 



1448 



1462 
1490 



1-20 



1-12 



1-18 



0.918 



0.931 



0.968 



0.753



0.891 



0.888 



1518 
1525 
1547 
1561 
1580 
1593 



1-20 



1-17 
1-21 
1-28 
1-25 
1-17 
1-28 



0.881 



0.968 
0.885 
0.974 
0.967 
0.923 
0.979 



0.561 
0.863 
0.591 
0.891 
0.899 
0.824
0.923 
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SBQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


1596 


1-16 


0.929


0 . /09 


1601 


1-36 


0.957


0.613


1606 


1-22 


0.979


0.831 


1607 


1-20 


0.974


0.770 


1608 


1-32 


0.921


0.753


1614 


1-33 


0.969


0.829


1616


1-20 


0.959


0.869


1625 




0.983


0.621


1632 


1-25 


0.910 


0.631 


1636 


1-33 


0.897 


0.591 


1639 


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0.568 


1647 


1-17 


0.923 


0.742 


1648 


1-22 


0.998 


0.980 



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TABLE 6 



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725


1 


1787 


3573 


5359 


784CIP2_1 


1103 


2 


1788


3574 


5360 


784CIP2_2 


2673 


3 


1789


3575


5361 


784CIP2_3


4117 


4 


1790 


3576 


5362 


784CIP2_4 


5556 


5 


1791 


3577 


5363 


784CIP2_5 


5562 


6 


1792 


3578 


5364 


784CIP2_6 


5562 


7 


1793 


3579 


5365 


784CIP2_7 


5562 


8 


1794 


3580 


5366 


784CIP2_8 


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563


10 


1796 


3582 


5368 


784CIP2_10 


5564 


11 


1797 


3583 


5369 


784CIP2_11


5565 


12 


1798 


3584 


5370 


784CIP2_12


5689 


13 


1799 


3585 


5371 


784CIP2_13


5729 


14 


1800 


3586 


5372 


784CIP2_14 


5745


15 


1801 


3587 


5373 


784CIP2_15


5777 


16 


1802 


3588 


5374 


784CIP2_16


5777 


17 


1803 


3589 


5375 


784CIP2_17


5789 


18 


1804 


3590 


5376 


784CIP2_18


5792 


19 


1805 


3591 


5377 


784CIP2_19


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2_21 


5805 


22 


1808


3594


5380 


784CIP2_22 


5844 


23 


1809 


3595 


5381 


784CIP2_23 


5844 


24 


1810 


3596 


5382 


784CIP2_24 


5850 


25 


1811 


3597 


5383 


784CIP2_25


5867 


26 


1812 


3598 


5384 


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2_27 


5995


28 


1814 


3600 


5386 


784CIP2_28 


5995


29 


1815 


3601 


5387 


784CIP2_29 


6005 


30 


1816 


3602 


5388 


784CIP2_30


6007 


31 


1817


3603 


5389 


784CIP2_31 


6007 


32 


1818 


3604 


5390 


784CIP2_32 


6009 


33 


1819 


3605 


5391 


784CIP2_33 


6012 




1820 


3606 


5392 


784CIP2_34


6015 


35


1821 


3607 


5393 


784CIP2_35 


6016 


36 


1822 


3608 


5394 


784CIP2_36 


6016 


37 


1823 


3609 


5395 


784CIP2_37


6018


38 


1824 


3610 


5396 


784CIP2_38 


6018 


39


1825 


3611 


5397 


784CIP2_39 


6018 


40 


1826 


3612 


5398 


784CIP2_40 


6023 


41 


1827 


3613 


5399 


784CIP2_41 


6070 


42 


1828 


3614 


5400 


784CIP2_42 


6081 


43 


1829 


3615 


5401 


784CIP2_43 


6089 


44 


1830 


3616 


5402 


784CIP2_44 


6118


45 


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832 


3618 


5404 


784CIP2_46


6130 


47 


1833 


3619 


5405 


784CIP2_47


6177 


48 


1834 


3620 


5406 


784CIP2_48


6189 


49 


1835 


3621 


5407 


784CIP2_49 


6191 


50 


1836 


3622 


5408 


784CIP2_50 


6204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2_54


6436 


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2_58


6458 


59 


1845 


3631 


5417 


784CIP2_59 


6458 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: Of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 

of contig

nucleotide 

sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID
NO: in 
U.S. S.N. 
09/488,725 


60 


1846 


3632 


5418 


784CIP2_60


C A C"~i 1 


61 


1847


3633


5419 


784CIP2_61 


b<tJZ 


62 


1848


3634


5420


784CIP2_62


6499


63


1849


3635


5421


784CIP2_63


6499


64


1850


3636


5422


784CIP2_64


6505


65


1851


3637


5423


784CIP2_65


6534


66


1852


3638


5424


784CIP2_66


6534


67


1853


3639


5425


784CIP2_67


6540


68


1854


3640


5426


784CIP2_68


6550


69


1855


3641


5427


784CIP2_69


6550


70


1856


3642


5428


784CIP2_70


6592


71


1857


3643


5429


784CIP2_71


6645


72


1858


3644


5430


784CIP2_72


6671


73


1859


3645


5431


784CIP2_73


6763


74


1860


3646


5432


784CIP2_74


6763


75


1861


3647


5433


784CIP2_75


6786


76


1862


3648


5434


784CIP2_76


6824


j 77 


1853 j 


3649 


5435 


784CIP2 77 


6830 | 


78 


1864 | 


3650 


5436 


784CIP2_78 


6831 | 

/*» o o ^ I 


79 


1865 j 


3651 


5437 


784CIP2_79 


6832 

^ rt *i ii 1 


! 80 


1866 | 


3652 


5438 


784CIP2_80 


6834 j 

a *> A I 


81 


1867 | 


3653 


5439 


784CIP2_81 


6834 1 


82 


1868 | 


3654 


5440 


784CIP2_82 


6835 j 

y rt 1 i 


83 


1869 


3655 


5441 


784CIP2_83 


6837 j 

^ rt ^ *^ I 


i 84 


1870 | 3656 


5442 


784CIP2_84 


6843 j 


85 


1871 j 3657 


5443 


784CIP2_85 


<• rt ^ rt 1 

6859 | 


86 


1872 3658 


5444 


784CIP2_B6 


6915 j 


| 87 


1873 


3659 


5445 


784CIP2_B7 


"1 o 1 

6932 | 

rt ^> i 


j 88 


1874 


3660 


5446 


784CIP2_38 


6957 

M Jf* M ^ 


j 89 


1875 


| 3661 


5447 


784CIP2__89 


6961 1 

^ r\ *^ *^ 1 


i 90 


1876 


[ 3662 


5448 


784CIP2_90 


6973 j 

^\ 1 


1 91 


1877 


3663 


5449 


784CIP2_91 


6973 j 


j 92 


1878 


3664 


5450 


784CIP2_93 


7007 j 

n A rt $ 




. 1879 


! 3665 


5451 


784CIP2_94 


7018 j 


94 


1880 


3656 


5452 


784CIP2_95 


7019 | 


j 95 


1881 


1 3667 


5453 


784CIP2_96 


7020 | 

rt *^ rt 1 


j 96 


1882 


( 3658 


5454 


784CIP2_97 


7020 j 


3 97 


1883 


| 3659 


5455 


784CIP2_98 


7021 J 


j 98 


1884 


( 3670 


5456 


784CIP2_99 


7023 | 

rt o rt) 1 


j 99 


1885 


| 3671 


5457 


784CIP2_100 


7027 | 

^ rt 1 O 1 


j 100 


1886 


367Z 


5458 


784CIP2_101 


7028 | 

r7 rt 1 Pi 1 


101 


1887 


j 3673 


| 5459 


784CIP2_102 


7029 I 


1 102 


1888 


! 3674 


5460 


784CIP2_103 


't rt o i 1 

7031 j 


1 103 


1889 


] 3675 


5461 


1 784CIP2_104 


7032 


\ 104 


1890 


3676 


5462 


784CIP2_105 


/ \J 36 


j 105 


1891 


j 3677 


5463 


784CIP2_106 


H rt "2 C 


| 106 


1892 


3678 


5464 


784CIP2_107 


*7rt*5 C 


| 107 


1893 


1 3679 


5465 


784CIP2_108 


«-> <-\ O Q 


I 1° 8 


1894 


j 3680 


5466 


784CIP2_109 


/U43 


i 109 


1895 


j 3681 


5467 


784CIP2 110 


•7 ^ 

/ 1/4* fl 


J 110 


1896 


j 3682 


5468 


784CIP2_111 


/ U*lfa 


I HI 


1897 


1 3683 


5469 


784CIP2_112 


n r\CA 
70541 


1 112 


1898 


1 3684 


5470 


784CIP2_113 


/Ofal 


\ 113 


1899 


j 3685 


5471 


784CIP2 114 


/ 0 / / 


1 114 


[ 1900 


| 3686 


5472 


784CIP2_115 


o r\ rt 

7092 


1 115 


1901 


| 3687 


5473 


/ o t ±\*Jir£ IIP 


7094 


116 


; 1902 


( 3688 


5474 


784CIP2__117 


7106 


j 117 


1903 


| 3689 


5475 


784CIP2_118 


7107 


| 118 


1904 


j 3690 


5476 


784CIP2 119 


7111 


| 119 


1905 


3691 


5477 ' 


784CIP2_120 


7123 


| 120 


1906 


j 3692 


5478 


784CIP2_121 


7142 


j 121 


1907 


| 3693 


5479 


784CIP2 122 


7142 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of con tig 


NO: 


docket number^ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S. S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 
sequence 




sequence 


priority 
application 




122 | 1908 | 3694 | 5480 | 784CIP2_123 | 7154
123 | 1909 | 3695 | 5481 | 784CIP2_124 | 7160
124 | 1910 | 3696 | 5482 | 784CIP2_125 | 7169
125 | 1911 | 3697 | 5483 | 784CIP2_126 | 7185
126 | 1912 | 3698 | 5484 | 784CIP2_127 | 7197
127 | 1913 | 3699 | 5485 | 784CIP2_128 | 7219
128 | 1914 | 3700 | 5486 | 784CIP2_129 | 7226
129 | 1915 | 3701 | 5487 | 784CIP2_130 | 7229
130 | 1916 | 3702 | 5488 | 784CIP2_131 | 7234
131 | 1917 | 3703 | 5489 | 784CIP2_132 | 7235
132 | 1918 | 3704 | 5490 | 784CIP2_133 | 7235
133 | 1919 | 3705 | 5491 | 784CIP2_134 | 7238
134 | 1920 | 3706 | 5492 | 784CIP2_135 | 7247
135 | 1921 | 3707 | 5493 | 784CIP2_136 | 7261
136 | 1922 | 3708 | 5494 | 784CIP2_137 | 7262
137 | 1923 | 3709 | 5495 | 784CIP2_138 | 7267
138 | 1924 | 3710 | 5496 | 784CIP2_139 | 7272
139 | 1925 | 3711 | 5497 | 784CIP2_140 | 7273
140 | 1926 | 3712 | 5498 | 784CIP2_141 | 7282
141 | 1927 | 3713 | 5499 | 784CIP2_142 | 7288
142 | 1928 | 3714 | 5500 | 784CIP2_143 | 7291
143 | 1929 | 3715 | 5501 | 784CIP2_144 | 7293
144 | 1930 | 3716 | 5502 | 784CIP2_145 | 7294
145 | 1931 | 3717 | 5503 | 784CIP2_146 | 7299
146 | 1932 | 3718 | 5504 | 784CIP2_147 | 7300
147 | 1933 | 3719 | 5505 | 784CIP2_148 | 7312
148 | 1934 | 3720 | 5506 | 784CIP2_149 | 7313
149 | 1935 | 3721 | 5507 | 784CIP2_150 | 7315
150 | 1936 | 3722 | 5508 | 784CIP2_151 | 7318
151 | 1937 | 3723 | 5509 | 784CIP2_152 | 7321
152 | 1938 | 3724 | 5510 | 784CIP2_153 | 7330
153 | 1939 | 3725 | 5511 | 784CIP2_154 | 7331
154 | 1940 | 3726 | 5512 | 784CIP2_155 | 7333
155 | 1941 | 3727 | 5513 | 784CIP2_156 | 7350
156 | 1942 | 3728 | 5514 | 784CIP2_157 | 7352
157 | 1943 | 3729 | 5515 | 784CIP2_158 | 7384
158 | 1944 | 3730 | 5516 | 784CIP2_159 | 7403
159 | 1945 | 3731 | 5517 | 784CIP2_160 | 7431
160 | 1946 | 3732 | 5518 | 784CIP2_161 | 7441
161 | 1947 | 3733 | 5519 | 784CIP2_162 | 7453
162 | 1948 | 3734 | 5520 | 784CIP2_163 | 7467
163 | 1949 | 3735 | 5521 | 784CIP2_164 | 7471
164 | 1950 | 3736 | 5522 | 784CIP2_165 | 7493
165 | 1951 | 3737 | 5523 | 784CIP2_166 | 7502
166 | 1952 | 3738 | 5524 | 784CIP2_167 | 7511
167 | 1953 | 3739 | 5525 | 784CIP2_168 | 7514
168 | 1954 | 3740 | 5526 | 784CIP2_169 | 7520
169 | 1955 | 3741 | 5527 | 784CIP2_170 | 7541
170 | 1956 | 3742 | 5528 | 784CIP2_171 | 7570
171 | 1957 | 3743 | 5529 | 784CIP2_172 | 7578
172 | 1958 | 3744 | 5530 | 784CIP2_173 | 7583
173 | 1959 | 3745 | 5531 | 784CIP2_174 | 7592
174 | 1960 | 3746 | 5532 | 784CIP2_175 | 7601
175 | 1961 | 3747 | 5533 | 784CIP2_176 | 7602
176 | 1962 | 3748 | 5534 | 784CIP2_177 | 7608
177 | 1963 | 3749 | 5535 | 784CIP2_178 | 7615
178 | 1964 | 3750 | 5536 | 784CIP2_179 | 7617
179 | 1965 | 3751 | 5537 | 784CIP2_181 | 7624
180 | 1966 | 3752 | 5538 | 784CIP2_182 | 7626
181 | 1967 | 3753 | 5539 | 784CIP2_183 | 7640
182 | 1968 | 3754 | 5540 | 784CIP2_184 | 7641
183 | 1969 | 3755 | 5541 | 784CIP2_185 | 7641



184 | 1970 | 3756 | 5542 | 784CIP2_186 | 7641
185 | 1971 | 3757 | 5543 | 784CIP2_187 | 7642
186 | 1972 | 3758 | 5544 | 784CIP2_188 | 7649
187 | 1973 | 3759 | 5545 | 784CIP2_189 | 7656
188 | 1974 | 3760 | 5546 | 784CIP2_190 | 7657
189 | 1975 | 3761 | 5547 | 784CIP2_191 | 7657
190 | 1976 | 3762 | 5548 | 784CIP2_192 | 7662
191 | 1977 | 3763 | 5549 | 784CIP2_193 | 7668
192 | 1978 | 3764 | 5550 | 784CIP2_194 | 7673
193 | 1979 | 3765 | 5551 | 784CIP2_195 | 7690
194 | 1980 | 3766 | 5552 | 784CIP2_196 | 7700
195 | 1981 | 3767 | 5553 | 784CIP2_197 | 7709
196 | 1982 | 3768 | 5554 | 784CIP2_198 | 7736
197 | 1983 | 3769 | 5555 | 784CIP2_199 | 7737
198 | 1984 | 3770 | 5556 | 784CIP2_200 | 7744
199 | 1985 | 3771 | 5557 | 784CIP2_201 | 7771
200 | 1986 | 3772 | 5558 | 784CIP2_202 | 7786
201 | 1987 | 3773 | 5559 | 784CIP2_203 | 7791
202 | 1988 | 3774 | 5560 | 784CIP2_204 | 7797
203 | 1989 | 3775 | 5561 | 784CIP2_205 | 7806
204 | 1990 | 3776 | 5562 | 784CIP2_206 | 7812
205 | 1991 | 3777 | 5563 | 784CIP2_207 | 7812
206 | 1992 | 3778 | 5564 | 784CIP2_208 | 7818
207 | 1993 | 3779 | 5565 | 784CIP2_209 | 7822
208 | 1994 | 3780 | 5566 | 784CIP2_210 | 7827
209 | 1995 | 3781 | 5567 | 784CIP2_211 | 7830
210 | 1996 | 3782 | 5568 | 784CIP2_212 | 7835
211 | 1997 | 3783 | 5569 | 784CIP2_214 | 7840
212 | 1998 | 3784 | 5570 | 784CIP2_215 | 7858
213 | 1999 | 3785 | 5571 | 784CIP2_216 | 7858
214 | 2000 | 3786 | 5572 | 784CIP2_217 | 7861
215 | 2001 | 3787 | 5573 | 784CIP2_218 | 7866
216 | 2002 | 3788 | 5574 | 784CIP2_219 | [illegible]
217 | 2003 | 3789 | 5575 | 784CIP2_220 | 7896
218 | 2004 | 3790 | 5576 | 784CIP2_221 | 7898
219 | 2005 | 3791 | 5577 | 784CIP2_222 | 7900
220 | 2006 | 3792 | 5578 | 784CIP2_223 | 7906
221 | 2007 | 3793 | 5579 | 784CIP2_224 | 7908
222 | 2008 | 3794 | 5580 | 784CIP2_225 | 7909
223 | 2009 | 3795 | 5581 | 784CIP2_226 | 7917
224 | 2010 | 3796 | 5582 | 784CIP2_227 | 7932
225 | 2011 | 3797 | 5583 | 784CIP2_228 | 7940
226 | 2012 | 3798 | 5584 | 784CIP2_229 | 7940
227 | 2013 | 3799 | 5585 | 784CIP2_230 | 7984
228 | 2014 | 3800 | 5586 | 784CIP2_231 | 7984
229 | 2015 | 3801 | 5587 | 784CIP2_232 | 8001
230 | 2016 | 3802 | 5588 | 784CIP2_233 | 8021
231 | 2017 | 3803 | 5589 | 784CIP2_234 | 8029
232 | 2018 | 3804 | 5590 | 784CIP2_235 | 8033
233 | 2019 | 3805 | 5591 | 784CIP2_236 | 8040
234 | 2020 | 3806 | 5592 | 784CIP2_237 | 8052
235 | 2021 | 3807 | 5593 | 784CIP2_238 | 8096
236 | 2022 | 3808 | 5594 | 784CIP2_239 | 8096
237 | 2023 | 3809 | 5595 | 784CIP2_240 | 8113
238 | 2024 | 3810 | 5596 | 784CIP2_241 | 8126
239 | 2025 | 3811 | 5597 | 784CIP2_242 | 8132
240 | 2026 | 3812 | 5598 | 784CIP2_243 | 8137
241 | 2027 | 3813 | 5599 | 784CIP2_244 | 8137
242 | 2028 | 3814 | 5600 | 784CIP2_245 | 8159
243 | 2029 | 3815 | 5601 | 784CIP2_246 | 8159
244 | 2030 | 3816 | 5602 | 784CIP2_247 | 8161
245 | 2031 | 3817 | 5603 | 784CIP2_248 | 8176

WO 01/53312

246 | 2032 | 3818 | 5604 | 784CIP2_249 | 8196
247 | 2033 | 3819 | 5605 | 784CIP2_250 | 8200
248 | 2034 | 3820 | 5606 | 784CIP2_251 | 8212
249 | 2035 | 3821 | 5607 | 784CIP2_252 | 8220
250 | 2036 | 3822 | 5608 | 784CIP2_253 | 8238
251 | 2037 | 3823 | 5609 | 784CIP2_254 | 8254
252 | 2038 | 3824 | 5610 | 784CIP2_255 | 8255
253 | 2039 | 3825 | 5611 | 784CIP2_256 | 8288
254 | 2040 | 3826 | 5612 | 784CIP2_257 | 8296
255 | 2041 | 3827 | 5613 | 784CIP2_258 | 8329
256 | 2042 | 3828 | 5614 | 784CIP2_259 | 8362
257 | 2043 | 3829 | 5615 | 784CIP2_260 | 8429
258 | 2044 | 3830 | 5616 | 784CIP2_261 | 8436
259 | 2045 | 3831 | 5617 | 784CIP2_262 | 8448
260 | 2046 | 3832 | 5618 | 784CIP2_263 | 8472
261 | 2047 | 3833 | 5619 | 784CIP2_264 | 8502
262 | 2048 | 3834 | 5620 | 784CIP2_265 | 8504
263 | 2049 | 3835 | 5621 | 784CIP2_266 | 8507
264 | 2050 | 3836 | 5622 | 784CIP2_268 | 8509
265 | 2051 | 3837 | 5623 | 784CIP2_269 | 8515
266 | 2052 | 3838 | 5624 | 784CIP2_270 | 8519
267 | 2053 | 3839 | 5625 | 784CIP2_271 | 8530
268 | 2054 | 3840 | 5626 | 784CIP2_272 | 8532
269 | 2055 | 3841 | 5627 | 784CIP2_273 | 8532
270 | 2056 | 3842 | 5628 | 784CIP2_274 | 8539
271 | 2057 | 3843 | 5629 | 784CIP2_275 | 8541
272 | 2058 | 3844 | 5630 | 784CIP2_276 | 8543
273 | 2059 | 3845 | 5631 | 784CIP2_277 | 8593
274 | 2060 | 3846 | 5632 | 784CIP2_278 | 8595
275 | 2061 | 3847 | 5633 | 784CIP2_279 | 8615
276 | 2062 | 3848 | 5634 | 784CIP2_280 | 8620
277 | 2063 | 3849 | 5635 | 784CIP2_281 | 8621
278 | 2064 | 3850 | 5636 | 784CIP2_282 | 8623
279 | 2065 | 3851 | 5637 | 784CIP2_283 | 8625
280 | 2066 | 3852 | 5638 | 784CIP2_284 | 8628
281 | 2067 | 3853 | 5639 | 784CIP2_285 | 8628
282 | 2068 | 3854 | 5640 | 784CIP2_286 | 8629
283 | 2069 | 3855 | 5641 | 784CIP2_287 | 8630
284 | 2070 | 3856 | 5642 | 784CIP2_288 | 8631
285 | 2071 | 3857 | 5643 | 784CIP2_289 | 8633
286 | 2072 | 3858 | 5644 | 784CIP2_290 | 8634
287 | 2073 | 3859 | 5645 | 784CIP2_291 | 8635
288 | 2074 | 3860 | 5646 | 784CIP2_292 | 8636
289 | 2075 | 3861 | 5647 | 784CIP2_293 | 8659
290 | 2076 | 3862 | 5648 | 784CIP2_294 | 8660
291 | 2077 | 3863 | 5649 | 784CIP2_295 | 8667
292 | 2078 | 3864 | 5650 | 784CIP2_296 | 8667
293 | 2079 | 3865 | 5651 | 784CIP2_297 | 8685
294 | 2080 | 3866 | 5652 | 784CIP2_298 | 8805
295 | 2081 | 3867 | 5653 | 784CIP2_299 | 8896
296 | 2082 | 3868 | 5654 | 784CIP2_300 | 8978
297 | 2083 | 3869 | 5655 | 784CIP2_301 | 9046
298 | 2084 | 3870 | 5656 | 784CIP2_302 | 9048
299 | 2085 | 3871 | 5657 | 784CIP2_303 | 9116
300 | 2086 | 3872 | 5658 | 784CIP2_304 | 9195
301 | 2087 | 3873 | 5659 | 784CIP2_305 | 9201
302 | 2088 | 3874 | 5660 | 784CIP2_306 | 9307
303 | 2089 | 3875 | 5661 | 784CIP2_307 | 9321
304 | 2090 | 3876 | 5662 | 784CIP2_308 | 9397
305 | 2091 | 3877 | 5663 | 784CIP2_309 | 9405
306 | 2092 | 3878 | 5664 | 784CIP2_310 | 9406
307 | 2093 | 3879 | 5665 | 784CIP2_311 | 9422
308 | 2094 | 3880 | 5666 | 784CIP2_312 | [illegible]
309 | 2095 | 3881 | 5667 | 784CIP2_313 | [illegible]
310 | 2096 | 3882 | 5668 | 784CIP2_314 | [illegible]
311 | 2097 | 3883 | 5669 | 784CIP2_315 | 9661
312 | 2098 | 3884 | 5670 | 784CIP2_316 | 9664
313 | 2099 | 3885 | 5671 | 784CIP2_317 | 9691
314 | 2100 | 3886 | 5672 | 784CIP2_318 | 9700
315 | 2101 | 3887 | 5673 | 784CIP2_319 | 9716
316 | 2102 | 3888 | 5674 | 784CIP2_320 | 9721
317 | 2103 | 3889 | 5675 | 784CIP2_321 | [illegible]
318 | 2104 | 3890 | 5676 | 784CIP2_322 | 9887
319 | 2105 | 3891 | 5677 | 784CIP2_323 | 9923
320 | 2106 | 3892 | 5678 | 784CIP2_324 | 9938
321 | 2107 | 3893 | 5679 | 784CIP2_325 | 9964
322 | 2108 | 3894 | 5680 | 784CIP2_326 | 10007
323 | 2109 | 3895 | 5681 | 784CIP2_327 | 10009
324 | 2110 | 3896 | 5682 | 784CIP2_328 | 10046
325 | 2111 | 3897 | 5683 | 784CIP2_329 | 10156
326 | 2112 | 3898 | 5684 | 784CIP2_330 | 10276
327 | 2113 | 3899 | 5685 | 784CIP2_331 | 10283
328 | 2114 | 3900 | 5686 | 784CIP2B_1 | 152
329 | 2115 | 3901 | 5687 | 784CIP2B_2 | 167
330 | 2116 | 3902 | 5688 | 784CIP2B_3 | 205
331 | 2117 | 3903 | 5689 | 784CIP2B_4 | 210
332 | 2118 | 3904 | 5690 | 784CIP2B_5 | 225
333 | 2119 | 3905 | 5691 | 784CIP2B_6 | 226
334 | 2120 | 3906 | 5692 | 784CIP2B_7 | 264
335 | 2121 | 3907 | 5693 | 784CIP2B_8 | 268
336 | 2122 | 3908 | 5694 | 784CIP2B_9 | 293
337 | 2123 | 3909 | 5695 | 784CIP2B_10 | 293
338 | 2124 | 3910 | 5696 | 784CIP2B_11 | 293
339 | 2125 | 3911 | 5697 | 784CIP2B_12 | 302
340 | 2126 | 3912 | 5698 | 784CIP2B_13 | 311
341 | 2127 | 3913 | 5699 | 784CIP2B_14 | 352
342 | 2128 | 3914 | 5700 | 784CIP2B_15 | 358
343 | 2129 | 3915 | 5701 | 784CIP2B_16 | 368
344 | 2130 | 3916 | 5702 | 784CIP2B_17 | 393
345 | 2131 | 3917 | 5703 | 784CIP2B_18 | 477
346 | 2132 | 3918 | 5704 | 784CIP2B_19 | 508
347 | 2133 | 3919 | 5705 | 784CIP2B_20 | 508
348 | 2134 | 3920 | 5706 | 784CIP2B_21 | 515
349 | 2135 | 3921 | 5707 | 784CIP2B_22 | 578
350 | 2136 | 3922 | 5708 | 784CIP2B_23 | 588
351 | 2137 | 3923 | 5709 | 784CIP2B_24 | 591
352 | 2138 | 3924 | 5710 | 784CIP2B_25 | 593
353 | 2139 | 3925 | 5711 | 784CIP2B_26 | 594
354 | 2140 | 3926 | 5712 | 784CIP2B_27 | 619
355 | 2141 | 3927 | 5713 | 784CIP2B_28 | 620
356 | 2142 | 3928 | 5714 | 784CIP2B_29 | 654
357 | 2143 | 3929 | 5715 | 784CIP2B_30 | 692
358 | 2144 | 3930 | 5716 | 784CIP2B_31 | 753
359 | 2145 | 3931 | 5717 | 784CIP2B_32 | 758
360 | 2146 | 3932 | 5718 | 784CIP2B_33 | 787
361 | 2147 | 3933 | 5719 | 784CIP2B_34 | 833
362 | 2148 | 3934 | 5720 | 784CIP2B_35 | 838
363 | 2149 | 3935 | 5721 | 784CIP2B_36 | 870
364 | 2150 | 3936 | 5722 | 784CIP2B_37 | 891
365 | 2151 | 3937 | 5723 | 784CIP2B_38 | 891
366 | 2152 | 3938 | 5724 | 784CIP2B_39 | 921
367 | 2153 | 3939 | 5725 | 784CIP2B_40 | 924
368 | 2154 | 3940 | 5726 | 784CIP2B_41 | 932
369 | 2155 | 3941 | 5727 | 784CIP2B_42 | 942
370 | 2156 | 3942 | 5728 | 784CIP2B_43 | 958
371 | 2157 | 3943 | 5729 | 784CIP2B_44 | 968
372 | 2158 | 3944 | 5730 | 784CIP2B_45 | 992
373 | 2159 | 3945 | 5731 | 784CIP2B_46 | 1025
374 | 2160 | 3946 | 5732 | 784CIP2B_47 | 1074
375 | 2161 | 3947 | 5733 | 784CIP2B_48 | 1104
376 | 2162 | 3948 | 5734 | 784CIP2B_49 | 1114
377 | 2163 | 3949 | 5735 | 784CIP2B_50 | 1144
378 | 2164 | 3950 | 5736 | 784CIP2B_51 | 1262
379 | 2165 | 3951 | 5737 | 784CIP2B_52 | 1318
380 | 2166 | 3952 | 5738 | 784CIP2B_53 | 1319
381 | 2167 | 3953 | 5739 | 784CIP2B_54 | 1328
382 | 2168 | 3954 | 5740 | 784CIP2B_55 | 1436
383 | 2169 | 3955 | 5741 | 784CIP2B_56 | 1464
384 | 2170 | 3956 | 5742 | 784CIP2B_57 | 1584
385 | 2171 | 3957 | 5743 | 784CIP2B_58 | 1617
386 | 2172 | 3958 | 5744 | 784CIP2B_59 | 1724
387 | 2173 | 3959 | 5745 | 784CIP2B_60 | 1728
388 | 2174 | 3960 | 5746 | 784CIP2B_61 | 1772
389 | 2175 | 3961 | 5747 | 784CIP2B_62 | 1809
390 | 2176 | 3962 | 5748 | 784CIP2B_63 | 1868
391 | 2177 | 3963 | 5749 | 784CIP2B_64 | 1898
392 | 2178 | 3964 | 5750 | 784CIP2B_65 | 1926
393 | 2179 | 3965 | 5751 | 784CIP2B_66 | 1965
394 | 2180 | 3966 | 5752 | 784CIP2B_67 | 1967
395 | 2181 | 3967 | 5753 | 784CIP2B_68 | 1995
396 | 2182 | 3968 | 5754 | 784CIP2B_69 | 2005
397 | 2183 | 3969 | 5755 | 784CIP2B_70 | 2027
398 | 2184 | 3970 | 5756 | 784CIP2B_71 | 2055
399 | 2185 | 3971 | 5757 | 784CIP2B_72 | 2103
400 | 2186 | 3972 | 5758 | 784CIP2B_73 | 2106
401 | 2187 | 3973 | 5759 | 784CIP2B_74 | 2166
402 | 2188 | 3974 | 5760 | 784CIP2B_75 | 2175
403 | 2189 | 3975 | 5761 | 784CIP2B_76 | 2176
404 | 2190 | 3976 | 5762 | 784CIP2B_78 | 2236
405 | 2191 | 3977 | 5763 | 784CIP2B_79 | 2250
406 | 2192 | 3978 | 5764 | 784CIP2B_80 | 2300
407 | 2193 | 3979 | 5765 | 784CIP2B_81 | 2323
408 | 2194 | 3980 | 5766 | 784CIP2B_82 | 2340
409 | 2195 | 3981 | 5767 | 784CIP2B_83 | 2371
410 | 2196 | 3982 | 5768 | 784CIP2B_84 | 2399
411 | 2197 | 3983 | 5769 | 784CIP2B_85 | 2411
412 | 2198 | 3984 | 5770 | 784CIP2B_86 | 2428
413 | 2199 | 3985 | 5771 | 784CIP2B_87 | 2430
414 | 2200 | 3986 | 5772 | 784CIP2B_88 | 2439
415 | 2201 | 3987 | 5773 | 784CIP2B_89 | 2447
416 | 2202 | 3988 | 5774 | 784CIP2B_90 | 2461
417 | 2203 | 3989 | 5775 | 784CIP2B_91 | 2487
418 | 2204 | 3990 | 5776 | 784CIP2B_92 | 2492
419 | 2205 | 3991 | 5777 | 784CIP2B_93 | 2512
420 | 2206 | 3992 | 5778 | 784CIP2B_94 | 2564
421 | 2207 | 3993 | 5779 | 784CIP2B_95 | 2678
422 | 2208 | 3994 | 5780 | 784CIP2B_96 | 2816
423 | 2209 | 3995 | 5781 | 784CIP2B_97 | 2818
424 | 2210 | 3996 | 5782 | 784CIP2B_98 | 2819
425 | 2211 | 3997 | 5783 | 784CIP2B_99 | 2943
426 | 2212 | 3998 | 5784 | 784CIP2B_100 | 3137
427 | 2213 | 3999 | 5785 | 784CIP2B_101 | 3137
428 | 2214 | 4000 | 5786 | 784CIP2B_102 | 3160
429 | 2215 | 4001 | 5787 | 784CIP2B_103 | 3323
430 | 2216 | 4002 | 5788 | 784CIP2B_104 | 3360
431 | 2217 | 4003 | 5789 | 784CIP2B_105 | 3362



432 | 2218 | 4004 | 5790 | 784CIP2B_106 | 3417
433 | 2219 | 4005 | 5791 | 784CIP2B_107 | 3418
434 | 2220 | 4006 | 5792 | 784CIP2B_108 | 3442
435 | 2221 | 4007 | 5793 | 784CIP2B_109 | 3442
436 | 2222 | 4008 | 5794 | 784CIP2B_110 | 3444
437 | 2223 | 4009 | 5795 | 784CIP2B_111 | 3855
438 | 2224 | 4010 | 5796 | 784CIP2B_112 | 3863
439 | 2225 | 4011 | 5797 | 784CIP2B_113 | 4090
440 | 2226 | 4012 | 5798 | 784CIP2B_114 | 4105
441 | 2227 | 4013 | 5799 | 784CIP2B_115 | 4142
442 | 2228 | 4014 | 5800 | 784CIP2B_116 | 4142
443 | 2229 | 4015 | 5801 | 784CIP2B_117 | 4149
444 | 2230 | 4016 | 5802 | 784CIP2B_118 | 4196
445 | 2231 | 4017 | 5803 | 784CIP2B_119 | 4202
446 | 2232 | 4018 | 5804 | 784CIP2B_120 | 4274
447 | 2233 | 4019 | 5805 | 784CIP2B_121 | 4304
448 | 2234 | 4020 | 5806 | 784CIP2B_122 | 4306
449 | 2235 | 4021 | 5807 | 784CIP2B_123 | 4311
450 | 2236 | 4022 | 5808 | 784CIP2B_124 | 4321
451 | 2237 | 4023 | 5809 | 784CIP2B_125 | 4323
452 | 2238 | 4024 | 5810 | 784CIP2B_126 | 4332
453 | 2239 | 4025 | 5811 | 784CIP2B_127 | 4488
454 | 2240 | 4026 | 5812 | 784CIP2B_128 | 4588
455 | 2241 | 4027 | 5813 | 784CIP2B_129 | [illegible]
456 | 2242 | 4028 | 5814 | 784CIP2B_130 | [illegible]
457 | 2243 | 4029 | 5815 | 784CIP2B_131 | 5577
458 | 2244 | 4030 | 5816 | 784CIP2B_132 | 5579
459 | 2245 | 4031 | 5817 | 784CIP2B_133 | 5582
460 | 2246 | 4032 | 5818 | 784CIP2B_134 | 5583
461 | 2247 | 4033 | 5819 | 784CIP2B_135 | 5584
462 | 2248 | 4034 | 5820 | 784CIP2B_136 | 5585
463 | 2249 | 4035 | 5821 | 784CIP2B_137 | 5591
464 | 2250 | 4036 | 5822 | 784CIP2B_138 | 5593
465 | 2251 | 4037 | 5823 | 784CIP2B_139 | 5594
466 | 2252 | 4038 | 5824 | 784CIP2B_140 | 5594
467 | 2253 | 4039 | 5825 | 784CIP2B_141 | 5598
468 | 2254 | 4040 | 5826 | 784CIP2B_142 | 5602
469 | 2255 | 4041 | 5827 | 784CIP2B_143 | 5605
470 | 2256 | 4042 | 5828 | 784CIP2B_144 | 5608
471 | 2257 | 4043 | 5829 | 784CIP2B_145 | 5617
472 | 2258 | 4044 | 5830 | 784CIP2B_146 | 5620
473 | 2259 | 4045 | 5831 | 784CIP2B_147 | 5622
474 | 2260 | 4046 | 5832 | 784CIP2B_148 | 5623
475 | 2261 | 4047 | 5833 | 784CIP2B_149 | 5624
476 | 2262 | 4048 | 5834 | 784CIP2B_150 | 5625
477 | 2263 | 4049 | 5835 | 784CIP2B_151 | 5627
478 | 2264 | 4050 | 5836 | 784CIP2B_152 | 5628
479 | 2265 | 4051 | 5837 | 784CIP2B_153 | 5630
480 | 2266 | 4052 | 5838 | 784CIP2B_154 | 5632
481 | 2267 | 4053 | 5839 | 784CIP2B_155 | 5640
482 | 2268 | 4054 | 5840 | 784CIP2B_156 | 5641
483 | 2269 | 4055 | 5841 | 784CIP2B_157 | 5643
484 | 2270 | 4056 | 5842 | 784CIP2B_158 | 5647
485 | 2271 | 4057 | 5843 | 784CIP2B_159 | 5649
486 | 2272 | 4058 | 5844 | 784CIP2B_160 | 5658
487 | 2273 | 4059 | 5845 | 784CIP2B_161 | 5659
488 | 2274 | 4060 | 5846 | 784CIP2B_162 | 5667
489 | 2275 | 4061 | 5847 | 784CIP2B_163 | 5672
490 | 2276 | 4062 | 5848 | 784CIP2B_164 | 5674
491 | 2277 | 4063 | 5849 | 784CIP2B_165 | 5678
492 | 2278 | 4064 | 5850 | 784CIP2B_166 | 5680
493 | 2279 | 4065 | 5851 | 784CIP2B_167 | 5684
494 


2280 


4066 


5852 


784CIP2BJL68 


5686 


495 


| 2281 


4067 


5853 


784CIP2B_169 


5694 


496 


2282 


4068 


5854 


784CIP2BJL70 


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498 


2284 


4070 


5856 


784CIP2B_172 


5712 


499 


2285 


4071 


5857 


784CIP2B_173 


5719 


500 


2286 


4072 


5858 


784CIP2B_174 


5720 


501 


2287 


4073 


5859 


784CIP2BJL75 


5727 


502 


2288 


4074 


5860 


784CIP2B 176 


5730 


503 


2289 


4075 


5861 


784CIP2B 177 


5734 


504 


2290 


4076 


5862 


784CIP2BJL78 


5738 


505 


2291 


4077 


5863 


784CIP2B_179 


5739 


506 


2292 


4078 


5864 


784CIP23_180 


5740 


507 


2293 


4079 


5865 


784CIP2BJL81 


5744 


508 


2294 


4080 


5866 


784CIP2B_182 


5748 


509 


2295 


4081 


5867 


784CIP23_183 


5749 


510 


2296 


4082 


5868 


784CIP2B_184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185 


5750 


512 


2298 


4084 


5870 


7B4CIP2BJL86 


5750 


: 513 


2299 


4085 


5871 


784CIP2B_187 


5761 


514 


2300 


4086 


5872 


7B4CIP2B_188 


5762 


515 


2301 


4087 


5873 


784CIP2B_189 


5767 


| 516 


2302 


4088 


5874 


784CIP2BJL90 


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2BJL92 


5784 1 


519 


2305 


4091 


5877 


784CIP2B_193 


5788 


520 


2306 


4092 


5878 


784CIP2B_194 


5798 


S 521 


2307 


4093 


5879 


784CIP2B_196 


5807 


522 


2308 


4094 


5880 


784CIP2B_197 


5818 


523 


2309 


4 095 


5881 


784CIP2BJL98 


5819 [ 


524 


2310 


4096 


5882 


784CIP2B_199 


5827 


525 


2311 


4097 


5883 


784CIP2B_200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201 


5842 


j 527 


2313 


4099 


5885 


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888 


784CIP2B_205 


5865 


531 


2317 


4103 


5889 


784CIP2B_206 


5871 


532 


2318 


4104 


5890 


784CIP2B 207 


5873 


533 


2319 


4105 


5891 


784CIP2B_208 


5873 


534 


2320 


4106 


5892 


784CIP2B_209 


5875 


535 


2321 


4107 


5893 


784CIP2B_210 


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537 


2323 


4109 


5895 


784CIP2B_212 


5880 


538 


2324 


4110 


5896 


784CIP2B_213 


5880 


539 


2325 


4111 


5897 


784CIP2B_214 


5880 


540 


2326 


4112 


5898 


784CIP2B_215 


5880 


541 


2327 


4113 


5899 


784CIP2B_216 


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901 


784CIP2B_21B 


5898 


544 


2330 


4116 


5902 


784CIP2BJ219 


5902 


545 


2331 


4117 


5903 


784CIP2B_220 


5904 


546 


2332 


4118 


5904 


784CIP2B_221 


5918 


547 


2333 


4119 


5905 


784CIP2B_222 


5921 


548 


2334 


4120 


5906 


784CIP2B_223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B_227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B_229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 
SEQ ID NO:
of full- 
length 
nucleotide 
sequence 



556 



557 



558 



559 



560 



561 



562 



563 



564 



565 



566 



567 



568
569



570 



571 



572 



573 



574 



575 



576 



SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 



2342 



2343 



2344 



2345 



2346 



2347 



2348 



2349 



2350 



2351 



2352 



2353 



2354 
2355 



2356 



2357 



2358 



2359 



2360 



2361 



2362 



SEQ ID NO: 
of contig 
nucleotide 
sequence 



4128 



4129 



4130 



4131 



4132 



4133 



4134 



4135 



4136 



4137 



4138 



4139 



4140 
4141 



4142 



4143 



4144 



4145 



4146 



4147 



4148 



SEQ ID 
NO: 

of contig 

peptide 

sequence 



Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 



SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 



5914 



5915 



784CIP2B_232 
784CIP2B 233 



5916 



784CIP2B 234 



5917 



784CIP2B 235 



5918 



784CIP2B 236 



5919 



784CIP2B 237 



5920 



5921 



784CIP2B_238 
784CIP2B 239 



5922 



784CIP2B_240



5923 



784CIP2B 241 



5924 



784CIP2B 242 



5925 



784CIP2B 



5926 
5927 



243
244



784CIP2B. 
784CIP2B 245 



5928 



784CIP2B 246 



5929 



784CIP2B 247 



5930 



784CIP2B 248 



5931 



784CIP2B 249 



5932 



784CIP2B 250 



5933 



5934 



784CIP2B_251 
784CIP2B 252 



5975 



5977 



5978 



5979 



5980 



5988 



5989 



5991 



5997 



5998 



6003 



6004 



6013 
6028 



6028 



6029 



6031 



6031 



6032 



6037 



6037 



577 



2363 



4149 



5935 



784CIP2B 253 



6043 



578 



579 



580 



2364 



2365 



2366 



4150 



5936 



784CIP2B 254 



4151 



5937 



784CIP2B 255 



4152 



5938 



784CIP2B 256



6044 



6046 



6048 



581 



582 



583 



584 



2367 



4153 



5939 



784CIP2B 257 



2368 



4154 



5940 



784CIP2B 258 



2369 



4155 



5941 



784CIP2B 259 



2370 



4156 



5942 



784CIP2B 260 



6049 



6051 
6053 



6060 



585 



586 



587 
588
589



590 



591 



592 



593 



594 



595
596



597 
598 



599 



600



601 



602 



603 



604 



605 



606 



607
608



609 



610 



611 



612 



613 



614 



615 



616
617



2371 



4157 



5943 



784CIP2B 261 



2372 



4158 



5944 



784CIP2B 262 



2373 
2374 
2375 



4159 
4160 
4161 



5945 
5946 
5947 



784CIP2B_263
784CIP2B_264
784CIP2B 265 



2376 



4162 



5948 



784CIP2B 266 



2377 



4163 



5949 



784CIP2B 267 



2378 



4164 



5950 



784CIP2B 268 



2379 



4165 



2380 



4166 



5951 
5952 



784CIP2B 269 



784CIP2B 270 



2381 
2382 



4167 
4168 



5953 
5954 



784CIP2B_272 
784CIP2B 273 



2383 
2384 



4169 
4170 



2385 



4171 



2386 



4172 



2387 



4173 



2388 



4174 



2389 



4175 



2390 



4176 



2391 



4177 



5955 
5956 



784CIP2B_274 
784CIP2B 275 



5957 



784CIP2B 276 



5958 



784CIP2B 277 



5959 



784CIP2B 278 



5960 



784CIP2B 279 



5961 



784CIP2B 280 



5962 



784CIP2B 281 



5963 



784CIP2B 282 



2392



4178 



2393 



4179 



2394 



4180 



2395 



4181 



2396 



4182 



2397 



2398 



2399 



2400 



2401 



2402 
2403 



4183 



4184 



4185 



4186 



4187 



4188 
4189 



5964 



784CIP2B 283 



5965 



784CIP2B 284 



5966 



784CIP2B 285 



5967 



784CIP2B 286 



5968 



784CIP2B 287 



5969 



784CIP2B 288 



5970 



784CIP2B 289 



5971 



784CIP2B 290 



5972 



784CIP2B 291 



5973 



784CIP2B 292 



5974 
5975 



784CIP2B_293 
784CIP2B_294



6063 



6066 



6067 
6068 
6073 



6076 



6076 



6077 



6079 



6082 



6088 



6091 
6094 



6101 



6103 



6104 
6108 



6112 



6121 



6125 



6126 
6128 



6129 



6133 



6133 
6135 



6139 



6141 



6145 



6146 



6148 



6149 
6149 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725




618 


2404 


4190 


5976 


784CIP2B_295 


6153 


619 


2405 


4191 


5977 


784CIP2B 296 


6159 


620 


2406 


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979 


784CIP2B_298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B_300 


6173 


624 


2410 


4196 


5982 


784CIP2B_301 


6190


625 


2411 


4197


5983 


784CIP2B_302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303 


6196 


627 


2413 


4199 


5985 


784CIP2B_304 


6197 


628 


2414 


4200 


5986 


784CIP2B_305 


6198 


629 


2415 


4201 


5987 


784CIP2B_306 


6198 


630 


2416 


4202 


5988 


784CIP2B_308


6214 


631 


2417 


4203 


5989 


784CIP2B_309 


6215 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633 


2419 


4205 


5991 


784CIP2B_311 


6226 


634 


2420 


4206 


5992 


784CIP2B_312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B 314 


6237 


637 


2423 


4209 


5995 


784CIP2B_315 


6238 


638


2424 


4210 


5996 


784CIP2B_316 


6239 


639 


2425 


4211 


5997 


784CIP2B_317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319 


6240 


642 


2428 


4214 


6000 


784CIP2B_320 


6244 


643 


2429 


4215 


6001 


784CIP2B_321 


6245 


644 


2430 


4216 


6002 


784CIP2B_322


6250 


645 


2431 


4217 


6003 


784CIP2B_323 


6252 


646 


2432 


4218 


6004 


784CIP2B_324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325 


6256 


648 


2434 


4220 


6006 


784CIP2B_326 


6260 


649 


2435 


4221 


6007 


784CIP2B_327 


6261 


650 


2436 


4222 


6008 


784CIP2B_328 


6264 


651 


2437 


4223 


6009 


784CIP2B_329 


6265 


652 


2438 


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B_331 


6270 


654 


2440 


4226 


6012 


784CIP2B_332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B_335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336 


6281 


658


2444 


4230 


6016 


784CIP2B_337 


6281 


659 


2445 


4231 


6017 


784CIP2B_338 


6288 


660 


2446 


4232 


6018 


784CIP2B_339 


6292 


661 


2447 


4233 


6019 


784CIP2B_340 


6294 


662 


2448 


4234 


6020 


784CIP2B_343 


6312 


663 


2449 


4235 


6021 


784CIP2B_344 


6312 


664 


2450 


4236 


6022 


784CIP2B_345 


6312 


665 


2451 


4237 


6023 


784CIP2B_346 


6322


666 


2452 


4238 


6024 


784CIP2B_347 


6324


667 


2453 


4239 


6025 


784CIP2B_349 


6329


668


2454 


4240 


6026 


784CIP2B_350 


6331 


669 


2455 


4241 


6027 


784CIP2B_351 


6333 


670 


2456 


4242 


6028 


784CIP2B_352 


6334 


671 


2457 


4243 


6029 


784CIP2B_353 


6337 


672 


2458 


4244 


6030 


784CIP2B_354 


6339


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B_356 


6348 


675 


2461 


4247 


6033 


784CIP2B_357 


6348 


676 


2462 


4248 


6034 


784CIP2B_358 


6350 


677 


2463 


4249 


6035 


784CIP2B_359 


6351 


678 


2464 


4250 


6036 


784CIP2B_360 


6355 


679 


2465 


4251 


6037 


784CIP2B 361 


6362 



WO 01/53312 



PCT/US00/34263 



SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 



680 
681
682



683

684

685



686 



687 



688
689



SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 



2466 



2467 



2468 



2469 



2470 



2471 



2472 



2473 



2474 



2475 



SEQ ID NO: 
of contig 
nucleotide 
sequence 



SEQ ID 
NO: 

of contig 

peptide 

sequence 



Priority 
docket number_
corresponding 
SEQ ID NO: in 
priority 
application 



4252



4253 



4254 



4255 



4256 



4257 



4258 



4259 



4260 



4261 



6038 



784CIP2B 362 



6039 



784CIP2B 363 



6040 



784CIP2B 364 



6041 



784CIP2B 365 



6042 



784CIP2B 366 



6043 



784CIP2B 367 



SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 



6044 



784CIP2B 368 



6045 



784CIP2B 369 



6046 



784CIP2B 370 



6047 



784CIP2B 371 



6368 



6369 



6371 



6376 



6379 



6380 



6381



6392 



6395 



6397 
6400 



690 



691 



2476 



2477 



4262 



6048 



784CIP2B 372 



4263 



6049 



784CIP2B 373 



6401 
6411 



692 



693
694



2478 



2479 



2480 



4264 



6050 



784CIP2B 374 



4265 



6051 



784CIP2B 375 



4266 



6052 



784CIP2B 376 



6411 



6411 



695 



696 



697 



698
699



700
701



2481 



2482 



2483 



2484 



2485 



2486



2487 



4267 



6053 



784CIP2B 377 



4268 



6054 



784CIP2B 378 



4269 



6055 



784CIP2B 379 



4270 



6056 



784CIP2B 380 



4271 



6057 



784CIP2B 381 



4272 



6058 



784CIP2B 382 



4273 



6059 



784CIP2B 383 



6416 



6418 



6422 



6423 
6426 



6427 
6428 



702 



2488 



4274 



6060 



784CIP2B 384 



6429 



703
704



705 



706
707



2489 



2490 



2491 



2492 



2493 



4275 



6061 



784CIP2B 385 



4276 



6062 



784CIP2B 386 



4277 



6063 



784CIP2B 387 



4278 



6064 



784CIP2B 388 



4279 



6065 



784CIP2B 389 



6430 



6432 



6432 



6438 



6441 
6446 



708
709



710 



711 



712 



713 



2494 



2495 



2496 



2497 



2498 



2499 



4280 



6066 



784CIP2B 390 



4281 



6067 



784CIP2B 391 



4282 



6068 



784CIP2B 392 



4283 



6069 



784CIP2B 394 



4284 



6070 



784CIP2B 395



4285 



6071 



784CIP2B 396 



6454 



6459 
6461 



6467 



6468 
6487 



714 



2500 



4286 



6072 



784CIP2B 397 



715 



2501 



4287 



6073 



784CIP2B 398 



6491 



716 



2502 



4288 



6074 



784CIP2B 399 



6506 
6514 



717 



2503 



4289 



6075 



784CIP2B 



401
402



718 



719 



2504 



2505 



4290 



6076 



784CIP2B 



4291 



6077 



784CIP2B 403 



6519 



6521 
6532 



720 



2506 



4292 



6078 



784CIP2B 404 



721 



2507 



4293 



6079 



784CIP2B 



405
406



6536 



722 



2508 



4294 



6080 



784CIP2B 



6543 
6544 



723 



724 



725 



726 



727 



2509 



2510 



2511 



2512 



2513 



4295 



6081 



784CIP2B 407 



4296 



6082 



4297 



6083 



4298 



6084 



4299 



6085 



784CIP2B 408 



784CIP2B 



784CIP2B 



409 
410 



784CIP2B 411 



6548



6551 



6551 



6552 



728 



2514 



4300 



6086 



784CIP2B 412 



6554 



729 



730 



2515 



2516 



4301 



6087 



784CIP2B 413 



4302 



6088 



784CIP2B 414 



6556 



6560 



731 



732 



733 



2517 



2518 



2519 



4303 



6089 



784CIP2B 415 



4304 



6090 



784CIP2B 416 



4305 



6091 



784CIP2B 417 



6563 



6564
6567



734 



2520 



4306 



6092 



784CIP2B 418 



6573 



735 



2521 



4307 



6093 



784CIP2B 419 



6575 
6577 



736 



2522 



2523 



4308 



6094 



784CIP2B 420 



4309 



6095 



784CIP2B 421 



6593 
6595 



738 



739 



2524 



2525 



4310 



6096 



784CIP2B 422 



4311 



6097 



784CIP2B 423 



6599 
6625 



740 



741 



2526 



2527 



4312 



6098 



784CIP2B 424 



4313 



6099 



784CIP2B 425 



6625 



SEQ ID NO: 

of full- 


SEQ ID 
NO: of 


SEQ ID NO: 
of contig 


SEQ ID 

NO: 


Priority 
docket number_


SEQ ID 
NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S. S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 
sequence 




sequence 


priority 
application 




742 


2528 


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101


784CIP2B_427 


6630 


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632 


746 


2532 


4318 


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B_431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B_433


6641 


750 


2536 


4322 


6108 


784CIP2B_434 


6644 


751 


2537 


4323 


6109


784CIP2B_435 


6646 


752 


2538


4324 


6110 


784CIP2B_436 


6648 


753 


2539 


4325 


6111 


784CIP2B_437 


6652 


754 


2540 


4326 


6112 


784CIP2B_438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439 


6657 


756 


2542 


4328 


6114 


784CIP2B_440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441 


6663 


758 


2544 


4330 


6116 


784CIP2B_442 


6664 


759


2545 


4331 


6117 


784CIP2B 443 


6668 


760 


2546 


4332 


6118 


784CIP2B_444 


6669 


761 


2547 


4333 


6119 


784CIP2B_445 


6673 


762 


2548 


4334 


6120 


784CIP2B 446 


6685 


763 


2549 


4335 


6121 


784CIP2B_447 


6687 


764 


2550 


4336 


6122 


784CIP2B_448 


6689 


765 


2551 


4337 


6123 


784CIP2B_449 


6693 


766 


2552 


4338 


6124 


784CIP2B_450 


6698 


767 


2553 


4339 


6125 


784CIP2B 451 


6699 


768 


2554 


4340 


6126 


784CIP2B_452 


6705 


769


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B 454 


6713


771 


2557 


4343 


6129 


784CIP2B_455 


6716 


772 


2558 


4344 


6130 


784CIP2B_456 


6725 


773 


2559 


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B_459 


6730 


776 


2562 


4348 


6134 


784CIP2B_460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461


6730 


778 


2564 


4350 


6136


784CIP2B_462 


6732 


779


2565 


4351 


6137 


784CIP2B_463 


6733 


780 


2566 


4352 


6138 


784CIP2B_464


6737


781 


2567 


4353 


6139 


784CIP2B_465 


6745 


782 


2568 


4354 


6140 


784CIP2B_466


6751 


783 


2569 


4355 


6141 


784CIP2B_467 


6754 


784 


2570 


4356 


6142 


784CIP2B_468 


6758 


785 


2571 


4357 


6143 


784CIP2B_469 


6761 


786 


2572 


4358 


6144 


784CIP2B_470 


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768 


788 


2574 


4360 


6146 


784CIP2B_472 


6773 


789 


2575 


4361 


6147 


784CIP2B_473


6776 


790 


2576 


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798 


792 


2578


4364 


6150 


784CIP2B_476 


6823 


793 


2579 


4365 


6151 


784CIP2B_477 


6825 


794 


2580 


4366 


6152 


784CIP2B 478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582 


4368 


6154 


784CIP2B 480 


6844 


797 


2583 


4369 


6155 


784CIP2B_482 


6849 


798 


2584 


4370 


6156 


784CIP2B 483 


6854 


799 


2585 


4371 


6157 


784CIP2B 484 


6857 


800 


2586 


4372 


6158


784CIP2B_485


6861 


801 


2587 


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161 


784CIP2B_488 


6877 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725


804 


2590 


4376


6162


784CIP2B_489


6880 


805 


2591 


4377


6163


784CIP2B_490


6885


806 


2592


4378


6164


784CIP2B_491


6890


807 


2593


4379


6165


784CIP2B_492


6890


808 


2594 


4380


6166


784CIP2B_493


6894


809


2595


4381


6167


784CIP2B_494


6901


810


2596


4382


6168


784CIP2B_495




811


2597


4383


6169


784CIP2B_496


o j yj ' 


812


2598


4384


6170


784CIP2B_497


0 7 X x 


813


2599


4385


6171


784CIP2B_498


6917 


814


2600


4386


6172


784CIP2B_499


6923 


815


2601


4387


6173


784CIP2B_500


6929 


816 


2602 


4388 


6174


784CIP2B_501


6931 


817


2603


4389


6175


784CIP2B_502


6935 


818


2604


4390


6176


784CIP2B_503


6940 


819


2605


4391


6177


784CIP2B_504


6945 


820


2606


4392


6178


784CIP2B_505


6946 


821


2607


4393


6179


784CIP2B_506


6947 


822


2608


4394


6180


784CIP2B_507


6949 


823


2609


4395


6181


784CIP2B_508


6959 


824


2610


4396


6182


784CIP2B_509


6960 


825


2611


4397


6183


784CIP2B_510


6962 


826


2612


4398


6184


784CIP2B_511


6963 


827


2613


4399


6185


784CIP2B_512


6967 


828


2614


4400


6186


784CIP2B_513


6983 


829


2615


4401


6187


784CIP2B_514


6988 


830


2616


4402


6188


784CIP2B_515


6996 


831


2617


4403


6189


784CIP2B_516


7003


832


2618


4404


6190


784CIP2B_517


701b 


833


2619


4405


6191


784CIP2B_518


7017 


834


2620


4406


6192


784CIP2B_519 


7025 


835


2621


4407


6193


784CIP2B_520 


7025 


836


2622


4408


6194


784CIP2B_521 


7025 


837


2623


4409


6195


784CIP2B_522


7050


838


2624


4410


6196


784CIP2B_523 


7051 


839


2625


4411


6197


784CIP2B_524 


7055 


840


2626


4412


6198


784CIP2B_525


7060


841


2627


4413


6199


784CIP2B_526


7064


842


2628


4414


6200


784CIP2B_527 


7067 


843


2629


4415


6201


784CIP2B_528


7071


844


2630


4416


6202


784CIP2B_529


7072


845


2631


4417


6203


784CIP2B_530 


7073 


846


2632


4418


6204


784CIP2B_531 


7076 


847


2633


4419


6205


784CIP2B_532 


7088 


848


2634


4420


6206


784CIP2B_533


7089 


849


2635


4421


6207


784CIP2B_534 


7091 


850


2636


4422


6208


784CIP2B_535 


7091 


851


2637


4423


6209


784CIP2B_536 


7104 


852


2638


4424


6210


784CIP2B_537 


7105 


853


2639


4425


6211


784CIP2B_538 


7105 


854


2640


4426


6212


784CIP2B_539 


7109 


855


2641


4427


6213


784CIP2B_540 


7109 


856


2642


4428


6214


784CIP2B_541 


7119 


857


2643


4429


6215


784CIP2B_542 


7120 


858


2644


4430


6216


784CIP2B_543 


7121 


859


2645 


4431 


6217 


784CIP2B_544 


7126 


860


2646 


4432 


6218 


784CIP2B_545 


7127 


861


2647 


4433 


6219 


784CIP2B_546 


7130 


862 


2648 


4434 


6220 


784CIP2B_547


7131 


863


2649 


4435 


6221 


784CIP2B_548 


7144 


864


2650 


4436 


6222 


784CIP2B_549 


7159 


865


2651 


4437 


6223 


784CIP2B_550


7163 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725



866 


2652 


4438 


6224 


784CIP2B 551 


7175 


867 


2653 


4439 


6225 
784CIP2B_797


8019 


1109 


2895 


4681 


6467 


784CIP2B_798 


8020 


1110 


2896 


4682 


6468 


784CIP2B_799 


8022 


1111 


2897 


4683 


6469 


784CIP2B_800 


8022


1112 


2898 


4684


6470 


784CIP2B_801 


8028 


1113 


2899 


4685 


6471 


784CIP2B_802 


8030 
SEQ ID NO:
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority
application 


SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 


1114 


2900 


4686 


6472 


784CIP2B_803 


8038 


1115 


2901 


4687 


6473 


784CIP2B_804 


8042 


1116 


2902 


4688 


6474 


784CIP2B_805 


8045 


1117 


2903 


4689 


6475 


784CIP2B_806 


8045 


1118 


2904 


4690 


6476 


784CIP2B_807 


8046 


1119 


2905 


4691 


6477 


784CIP2B_808 


8047 


1120 


2906 


4692 


6478 


784CIP2B_809 


8051 


1121 


2907 


4693 


6479 


784CIP2B_810 


8059 


1122 


2908 


4694 


6480 


784CIP2B_811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482 


784CIP2B_813 


8074 


1125 


2911 


4697 


6483 


784CIP2B 814 


8077 


1126 


2912 


4698 


6484 


784CIP2B_815 


8078 


1127 


2913 


4699 


6485 


784CIP2B_816 


8079 


1128 


2914 


4700 


6486 


784CIP2B_817


8084 


1129 


2915 


4701 


6487 


784CIP2B_818


8088 


1130 


2916 


4702


6488 


784CIP2B_819 


8090 


1131 


2917 


4703 


6489 


784CIP2B_820 


8091 


1132 


2918 


4704 


6490 


784CIP2B_821 


8099 


1133 


2919 


4705 


6491 


784CIP2B_822 


8099 


1134 


2920 


4706 


6492 


784CIP2B_823 


8100 


1135 


2921 


4707 


6493


784CIP2B 824 


8102 


1136 


2922 


4708 


6494 


784CIP2B_825 


8103 


1137 


2923 


4709 


6495 


784CIP2B_826 


8103 


1138 


2924 


4710 


6496 


784CIP2B_827 


8104 


1139 


2925 


4711 


6497 


784CIP2B 828 


8108 


1140 


2926 


4712 


6498 


784CIP2B 829 


8110 


1141 


2927 


4713 


6499 


784CIP2B_830 


8116 


1142 


2928 


4714 


6500 


784CIP2B_831 


8117 


1143 


2929 


4715


6501 


784CIP2B_832 


8123 


1144 


2930 


4716 


6502 


784CIP2B_833 


8130 


1145 


2931 


4717 


6503 


784CIP2B_834 


8130 


1146 


2932 


4718 


6504 


784CIP2B_835


8143


1147 


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506 


784CIP2B_837 


8154 


1149 


2935 


4721 


6507 


784CIP2B_838 


8155 


1150 


2936 


4722 


6508 


784CIP2B_839 


8162 


1151 


2937


4723 


6509 


784CIP2B_840 


8163 


1152 


2938 


4724 


6510 


784CIP2B_841 


8172 


1153 


2939 


4725 


6511 


784CIP2B_842 


8173 


1154 


2940 


4726 


6512 


784CIP2B_843 


8179 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182 


1156 


2942 


4728


6514 


784CIP2B_845 


8183 


1157 


2943 


4729 


6515 


784CIP2B_846 


8184 


1158


2944 


4730 


6516 


784CIP2B_847


8185 


1159 


2945 


4731 


6517 


784CIP2B_848 


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B_850 


8190 


1162 


2948 


4734 


6520 


784CIP2B_851 


8190 


1163 


2949 


4735 


6521 


784CIP2B 852 


8192 


1164 


2950 


4736 


6522 


784CIP2B 853 


8193 


1165 


2951 


4737 


6523 


784CIP2B_854 


8197 


1166 


2952 


4738 


6524 


784CIP2B_855 


8197 


1167 


2953 


4739 


6525 


784CIP2B_856 


8199 


1168 


2954 


4740 


6526 


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B_859 


8208 


1171 


2957 


4743


6529


784CIP2B 860 


8209 


1172 


2958


4744 


6530 


784CIP2B_861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214


1174 


2960


4746 


6532 


784CIP2B 863 


8217 


1175 


2961 


4747 


6533 


784CIP2B_864 


8223 
SEQ ID NO:
of full- 
length 
nucleotide 
sequence 



1176 



1177 



1178 



1179 



1180 



1181 



1182 



SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 



2962 



2963 



2964 



2965 



2966 



2967 



2968 



SEQ ID NO: 
of contig 
nucleotide 
sequence 



SEQ ID 
NO: 

of contig 

peptide 

sequence 



Priority 
docket number_ 
corresponding
SEQ ID NO: in
priority
application



SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 



4748 



6534 



784CIP2B 865



4749 



6535 



784CIP2B 866



4750 



6536 



784CIP2B 867 



4751 



6537 



784CIP2B 868 



4752 



6538 



784CIP2B 869 



4753 



6539 



784CIP2B 870 



4754 



6540 



784CIP2B_871 
784CIP2B 872 



8224 
8226 



8227 
8229 



8232 



8236 



8239 
8244 



1183 



1184 



1185 



1186 



1187 



1188 



2969 



2970 



2971 



2972 



2973 



2974 



4755 



6541 



4756 



6542 



784CIP2B 873 



4757 



6543 



784CIP2B 874 



4758 



6544 



784CIP2B 875 



4759 



6545 



784CIP2B 876 



4760 



6546 



784CIP2B_877
784CIP2B 878 



8245 
8248 



8251 



8253 



8260 
8262 



1189 
1190 



1191 



1192 



1193 



1194 



2975 
2976 



2977 



2978 



2979 



2980 



4761 
4762 



6547 
6548 



784CIP2B 879 



4763 



6549 



784CIP2B 880



4764 



6550 



784CIP2B 881 



4765 



6551 



784CIP2B 882 



4766 



6552 



784CIP2B 883 



8268 



8270 
8272 



8274 
8274 



1195


2981


4767


6553


784CIP2B_884


8275


1196


2982


4768


6554


784CIP2B_885


8277 


1197


2983


4769


6555


784CIP2B_886


8281


1198


2984


4770


6556


784CIP2B_887


8283


1199


2985


4771


6557


784CIP2B_888


8289


1200


2986


4772


6558


784CIP2B_889


8295 


1201


2987


4773


6559


784CIP2B_890


8300


1202


2988


4774


6560


784CIP2B_891


8303


1203


2989


4775


6561


784CIP2B_892


8304


1204


2990


4776


6562


784CIP2B_893


8305 


1205


2991


4777


6563


784CIP2B_894


8309


1206


2992


4778


6564


784CIP2B_895


8318


1207


2993


4779


6565


784CIP2B_896


8319


1208


2994


4780


6566


784CIP2B_897


8321


1209


2995


4781


6567


784CIP2B_898


8322


1210


2996


4782


6568


784CIP2B_899


8323


1211 


2997 


4783


6569


784CIP2B_900


8325


1212


2998


4784


6570


784CIP2B_901


8331


1213


2999


4785


6571


784CIP2B_902


8332


1214 


3000 


4786


6572


784CIP2B_903


8333


1215 


3001 


4787


6573


784CIP2B_904


8335


1216 


3002 


4788 


6574 


784CIP2B_905 


8336


1217 


3003 


4789 


6575 


784CIP2B_906 


8337


1218


3004


4790


6576


784CIP2B_907


8340


1219 


3005 


4791 


6577 


784CIP2B_908


8343


1220 


3006 


4792 


6578 


784CIP2B_909 


8347


1221 


3007


4793


6579


784CIP2B_910


8349


1222 


3008


4794


6580


784CIP2B_911


8351


1223 


3009


4795


6581


784CIP2B_912 


8353 


1224 


3010


4796


6582


784CIP2B_913


8355 


1225 


3011


4797


6583


784CIP2B_914 


8361 


1226 


3012


4798


6584


784CIP2B_915 


8365 


1227 


3013


4799


6585


784CIP2B_916 


8367 


1228 


3014


4800


6586


784CIP2B_917 


8369 


1229 


3015


4801


6587


784CIP2B_919


8375 


1230 


3016 


4802 


6588


784CIP2B_920 


8387 


1231 


3017


4803


6589


784CIP2B_921 


8391 


1232 


3018


4804


6590


784CIP2B_922 


8393 


1233 


3019 


4805 


6591


784CIP2B_923


8393


1234 


3020


4806


6592


784CIP2B_924 


8394 


1235 


3021 


4807


6593


784CIP2B_925 


8395 


1236 


3022


4808


6594


784CIP2B_926


8396 


1237 


3023


4809


6595


784CIP2B_927 


8398 
SEQ ID NO:


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority


SEQ ID 


of full- 


NO: of 


of contig 


NO: 


docket number_


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U.S. S.N. 


nucleotide 


length 


sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence 


peptide 




sequence 


priority 






sequence 






application 




1238 


3024 


4810 


6596 


784CIP2B_928


8402 


1239 


3025 


4811 


6597 


784CIP2B_929 


8402 


1240 


3026 


4812 


6598 


784CIP2B_930 


8405 


1241 


3027 


4813 


6599 


784CIP2B_931 


8406 


1242 


3028 


4814 


6600 


784CIP2B_932


8409 


1243 


3029 


4815 


6601 


784CIP2B_933


8410 


1244 


3030 


4816 


6602 


784CIP2B_934 


8414


1245 


3031 


4817 


6603 


784CIP2B_935 


8415 


1246 


3032 


4818 


6604 


784CIP2B_936 


8419 


1247 


3033 


4819 


6605 


784CIP2B_937 


8426 


1248


3034 


4820 


6606 


784CIP2B_938 


8430 


1249 


3035 


4821 


6607 


784CIP2B_939 


8431 


1250 


3036 


4822 


6608 


784CIP2B_940 


8432 


1251 


3037 


4823 


6609 


784CIP2B_941 


8433 


1252 


3038 


4824


6610 


784CIP2B_942 


8434 


1253 


3039 


4825 


6611 


784CIP2B_943 


8438 


1254 


3040 


4826 


6612 


784CIP2B_944 


8439


1255 


3041 


4827 


6613 


784CIP2B_945 


8441 


1256 


3042 


4828 


6614 


784CIP2B_946 


8450


1257 


3043 


4829 


6615 


784CIP2B_947 


8451 


1258 


3044 


4830 


6616 


784CIP2B_948 


8452


1259 


3045 


4831


6617 


784CIP2B_949 


8460 


1260 


3046 


4832 


6618 


784CIP2B_950 


8461 


1261 


3047 


4833 


6619 


784CIP2B_951 


8462 


1262 


3048 


4834 


6620 


784CIP2B_952 


8464


1263 


3049 


4835 


6621 


784CIP2B_953 


8465 


1264 


3050 


4836 


6622 


784CIP2B_954 


8467 


1265 


3051 


4837 


6623 


784CIP2B_955 


8470 


1266 


3052 


4838 


6624 


784CIP2B_956 


8471 


1267 


3053 


4839 


6625 


784CIP2B_957 


8473 


1268 


3054 


4840 


6626 


784CIP2B_958 


8474 | 


1269 


3055 


4841 


6627 


784CIP2B_959 


8475 


1270 


3056 


4842 


6628 


784CIP2B 960 


8476 


1271 


3057 


4843 


6629 


784CIP2B_961 


8480 


1272 


3058 


4844 


6630 


784CIP2B_962 


8482 


1273 


3059 


4845 


6631 


784CIP2B 963 


8482 


1274 


3060 


4846 


6632 


784CIP2B 964 


8486 


1275 


3061 


4847 


6633 


784CIP2B_965 


8488 


1276 


3062 


4848 


6634 


784CIP2B_966 


8492 


1277 | 


3063 


4849 


6635 


784CIP2B_967 


8494 | 


1278 


3064 


4850 


6636 


784CIP2B_96B 


8496 


1279 


3065 


4851 


6637 


784CIP2B__969 


8497 


1280 . 


3066 


4852 


6638 


784CIP2B_970 


8499 


1281 


3067 


4853 


6639 


784CIP2B_971 


8513 


1282 


3068 


4854 


6640 


784CIP2B_972 


8522 


1283 


3069 


4855 


6641 


784CIP2B_973 


8526 


1284 


3070 


4856 


6642 


784CIP2B_974 


8531 


1285 


3071 


4857 


6643 


784CIP2B_975 


8533 


1286 


3072 


4858 


6644 


784CIP2B_976 


8542 


1287 


3073 


4859 


6645 


784CIP2B_977 


8544 


1288 


3074 


4860 


6646 


784CIP2B_978 


8565 


1289 | 


3075 


4861 


6647 


784CIP2B_979 


8565 


1290 


3076 


4862 


6648 


784CIP2B_980 


8572 


1291 


3077 


4863 


6649 


784CIP2B_981 


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983 


8584 


1294 


3080 


4866 


6652 


784CIP2B_984 


8598 


1295 


3081 


4867 


6653 


784CIP2B_985 


8602 


1296 


3082 


4868 


6654 


784CIP2B_986 


8604 


1297 


3083 | 


4869 


6655 


784CIP2B_987 


8609 


1298 


3084 


4870 


6656 


784CIP2B_988 


8612 


12?9 


3085 


4871 


6657 


784CIP2B_989 


8637 
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SEQ ID NO: 
or tun- 


SEQ ID 
jno : or 


bby ID NO: 
ox. conuxy 


0£>Q 1U 

..NSJ . 


riiui x uy 


5F(1 Tf) 1 
NO • in i 

v 9 fill | 


xengun 


■Fill 1 - 
xuxx — 


11 LLV LCU U1UC 


O'F r»oiih *i cr 


co r t& s Don di ncr 


U.S. S.N. j 


riu c icoi iuc 






"D&Dt ids 

^tr^mm^tM Km mm* %^^m» 


SEQ ID NO: in 


09/488,725 




T1^T\t~ n r^o 




seauence 

**m* m *m— ^mm "*m^ m- -m> ^m* 


priority 
application 




1300  3086  4872  6658  784CIP2B_990  8640
1301  3087  4873  6659  784CIP2B_991  8643
1302  3088  4874  6660  784CIP2B_992  8645
1303  3089  4875  6661  784CIP2B_993  8650
1304  3090  4876  6662  784CIP2B_994  8651
1305  3091  4877  6663  784CIP2B_995  8654
1306  3092  4878  6664  784CIP2B_996  8655
1307  3093  4879  6665  784CIP2B_997  8657
1308  3094  4880  6666  784CIP2B_998  8665
1309  3095  4881  6667  784CIP2B_999  8668
1310  3096  4882  6668  784CIP2B_1000  8671
1311  3097  4883  6669  784CIP2B_1001  8672
1312  3098  4884  6670  784CIP2B_1002  8692
1313  3099  4885  6671  784CIP2B_1003  8706
1314  3100  4886  6672  784CIP2B_1004  8716
1315  3101  4887  6673  784CIP2B_1005  8719
1316  3102  4888  6674  784CIP2B_1006  8743
1317  3103  4889  6675  784CIP2B_1007  8764
1318  3104  4890  6676  784CIP2B_1008  8764
1319  3105  4891  6677  784CIP2B_1009  8764
1320  3106  4892  6678  784CIP2B_1010  8774
1321  3107  4893  6679  784CIP2B_1011  8782
1322  3108  4894  6680  784CIP2B_1012  8796
1323  3109  4895  6681  784CIP2B_1013  8827
1324  3110  4896  6682  784CIP2B_1014  8842
1325  3111  4897  6683  784CIP2B_1015  8842
1326  3112  4898  6684  784CIP2B_1016  8858
1327  3113  4899  6685  784CIP2B_1017  8871
1328  3114  4900  6686  784CIP2B_1018  8921
1329  3115  4901  6687  784CIP2B_1019  8927
1330  3116  4902  6688  784CIP2B_1020  8942
1331  3117  4903  6689  784CIP2B_1021  8994
1332  3118  4904  6690  784CIP2B_1022  9023
1333  3119  4905  6691  784CIP2B_1023  9028
1334  3120  4906  6692  784CIP2B_1024  9058
1335  3121  4907  6693  784CIP2B_1025  9058
1336  3122  4908  6694  784CIP2B_1026  9079
1337  3123  4909  6695  784CIP2B_1027  9079
1338  3124  4910  6696  784CIP2B_1028  9082
1339  3125  4911  6697  784CIP2B_1029  9084
1340  3126  4912  6698  784CIP2B_1030  9093
1341  3127  4913  6699  784CIP2B_1031  9101
1342  3128  4914  6700  784CIP2B_1032
1343  3129  4915  6701  784CIP2B_1033  9105
1344  3130  4916  6702  784CIP2B_1034  9151
1345  3131  4917  6703  784CIP2B_1035  9161
1346  3132  4918  6704  784CIP2B_1036  9172
1347  3133  4919  6705  784CIP2B_1037  9174
1348  3134  4920  6706  784CIP2B_1038  9204
1349  3135  4921  6707  784CIP2B_1039  9234
1350  3136  4922  6708  784CIP2B_1040  9235
1351  3137  4923  6709  784CIP2B_1041  9239
1352  3138  4924  6710  784CIP2B_1042  9256
1353  3139  4925  6711  784CIP2B_1043  9276
1354  3140  4926  6712  784CIP2B_1044  9345
1355  3141  4927  6713  784CIP2B_1045  9379
1356  3142  4928  6714  784CIP2B_1046  9435
1357  3143  4929  6715  784CIP2B_1047  9437
1358  3144  4930  6716  784CIP2B_1048  9469
1359  3145  4931  6717  784CIP2B_1049  9500
1360  3146  4932  6718  784CIP2B_1050  9502
1361  3147  4933  6719  784CIP2B_1051  9520

1362  3148  4934  6720  784CIP2B_1052  9541
1363  3149  4935  6721  784CIP2B_1053  9541
1364  3150  4936  6722  784CIP2B_1054  9548
1365  3151  4937  6723  784CIP2B_1055  9556
1366  3152  4938  6724  784CIP2B_1056  9556
1367  3153  4939  6725  784CIP2B_1057  9575
1368  3154  4940  6726  784CIP2B_1058  9589
1369  3155  4941  6727  784CIP2B_1059  9599
1370  3156  4942  6728  784CIP2B_1060  9602
1371  3157  4943  6729  784CIP2B_1061  9606
1372  3158  4944  6730  784CIP2B_1062  9622
1373  3159  4945  6731  784CIP2B_1063  9623
1374  3160  4946  6732  784CIP2B_1064  9646
1375  3161  4947  6733  784CIP2B_1065  9747
1376  3162  4948  6734  784CIP2B_1066  9773
1377  3163  4949  6735  784CIP2B_1067  9785
1378  3164  4950  6736  784CIP2B_1068  9801
1379  3165  4951  6737  784CIP2B_1069  9811
1380  3166  4952  6738  784CIP2B_1070  9843
1381  3167  4953  6739  784CIP2B_1071  9854
1382  3168  4954  6740  784CIP2B_1072  9854
1383  3169  4955  6741  784CIP2B_1073  9864
1384  3170  4956  6742  784CIP2B_1074  9864
1385  3171  4957  6743  784CIP2B_1075  9871
1386  3172  4958  6744  784CIP2B_1076  9879
1387  3173  4959  6745  784CIP2B_1077  9881
1388  3174  4960  6746  784CIP2B_1078  9885
1389  3175  4961  6747  784CIP2B_1079  9901
1390  3176  4962  6748  784CIP2B_1080  9912
1391  3177  4963  6749  784CIP2B_1081  991^
1392  3178  4964  6750  784CIP2B_1082  9921
1393  3179  4965  6751  784CIP2B_1083  9925
1394  3180  4966  6752  784CIP2B_1084  9930
1395  3181  4967  6753  784CIP2B_1085  9949
1396  3182  4968  6754  784CIP2B_1086  9951
1397  3183  4969  6755  784CIP2B_1087  9959
1398  3184  4970  6756  784CIP2B_1088  9973
1399  3185  4971  6757  784CIP2B_1089  9982
1400  3186  4972  6758  784CIP2B_1090  9994
1401  3187  4973  6759  784CIP2B_1091  10021
1402  3188  4974  6760  784CIP2B_1092  10041
1403  3189  4975  6761  784CIP2B_1094  10067
1404  3190  4976  6762  784CIP2B_1095  10073
1405  3191  4977  6763  784CIP2B_1096  10112
1406  3192  4978  6764  784CIP2B_1097  10117
1407  3193  4979  6765  784CIP2B_1098  10132
1408  3194  4980  6766  784CIP2B_1099  10169
1409  3195  4981  6767  784CIP2B_1100  10217
1410  3196  4982  6768  784CIP2B_1101  10226
1411  3197  4983  6769  784CIP2B_1102  10232
1412  3198  4984  6770  784CIP2B_1103  10237
1413  3199  4985  6771  784CIP2B_1104  10279
1414  3200  4986  6772  784CIP2C_1  33
1415  3201  4987  6773  784CIP2C_2  271
1416  3202  4988  6774  784CIP2C_3  848
1417  3203  4989  6775  784CIP2C_4  849
1418  3204  4990  6776  784CIP2C_5  864
1419  3205  4991  6777  784CIP2C_6  953
1420  3206  4992  6778  784CIP2C_7  980
1421  3207  4993  6779  784CIP2C_8  1595
1422  3208  4994  6780  784CIP2C_9  1697
1423  3209  4995  6781  784CIP2C_10  1744

1424  3210  4996  6782  784CIP2C_11  1937
1425  3211  4997  6783  784CIP2C_12  1955
1426  3212  4998  6784  784CIP2C_13  1955
1427  3213  4999  6785  784CIP2C_14  2185
1428  3214  5000  6786  784CIP2C_15  2889
1429  3215  5001  6787  784CIP2C_16  2901
1430  3216  5002  6788  784CIP2C_17  2902
1431  3217  5003  6789  784CIP2C_18  2905
1432  3218  5004  6790  784CIP2C_19  2948
1433  3219  5005  6791  784CIP2C_20  2956
1434  3220  5006  6792  784CIP2C_21  2959
1435  3221  5007  6793  784CIP2C_22  2965
1436  3222  5008  6794  784CIP2C_23  2966
1437  3223  5009  6795  784CIP2C_24  2970
1438  3224  5010  6796  784CIP2C_25  2985
1439  3225  5011  6797  784CIP2C_26  2987
1440  3226  5012  6798  784CIP2C_27  2993
1441  3227  5013  6799  784CIP2C_28  2993
1442  3228  5014  6800  784CIP2C_29  3017
1443  3229  5015  6801  784CIP2C_30  3046
1444  3230  5016  6802  784CIP2C_31  3050
1445  3231  5017  6803  784CIP2C_32  3357
1446  3232  5018  6804  784CIP2C_33  3359
1447  3233  5019  6805  784CIP2C_34  3432
1448  3234  5020  6806  784CIP2C_35  3438
1449  3235  5021  6807  784CIP2C_36  3439
1450  3236  5022  6808  784CIP2C_39  3463
1451  3237  5023  6809  784CIP2C_40  3466
1452  3238  5024  6810  784CIP2C_41  3466
1453  3239  5025  6811  784CIP2C_42  3467
1454  3240  5026  6812  784CIP2C_43  3468
1455  3241  5027  6813  784CIP2C_44  3483
1456  3242  5028  6814  784CIP2C_45  3484
1457  3243  5029  6815  784CIP2C_46  3468
1458  3244  5030  6816  784CIP2C_47  3491
1459  3245  5031  6817  784CIP2C_48  3493
1460  3246  5032  6818  784CIP2C_49  3494
1461  3247  5033  6819  784CIP2C_50  3495
1462  3248  5034  6820  784CIP2C_51  3496
1463  3249  5035  6821  784CIP2C_52  3503
1464  3250  5036  6822  784CIP2C_53  3503
1465  3251  5037  6823  784CIP2C_54  3504
1466  3252  5038  6824  784CIP2C_55  3511
1467  3253  5039  6825  784CIP2C_56  3531
1468  3254  5040  6826  784CIP2C_57  3536
1469  3255  5041  6827  784CIP2C_58  3546
1470  3256  5042  6828  784CIP2C_59  3548
1471  3257  5043  6829  784CIP2C_60  3551
1472  3258  5044  6830  784CIP2C_61  3553
1473  3259  5045  6831  784CIP2C_62  3564
1474  3260  5046  6832  784CIP2C_63  3567
1475  3261  5047  6833  784CIP2C_64  3572
1476  3262  5048  6834  784CIP2C_65  3573
1477  3263  5049  6835  784CIP2C_66  3574
1478  3264  5050  6836  784CIP2C_67  3583
1479  3265  5051  6837  784CIP2C_68  3615
1480  3266  5052  6838  784CIP2C_69  3623
1481  3267  5053  6839  784CIP2C_70  3629
1482  3268  5054  6840  784CIP2C_71  3666
1483  3269  5055  6841  784CIP2C_72  3667
1484  3270  5056  6842  784CIP2C_73  3906
1485  3271  5057  6843  784CIP2C_74  3912

1486  3272  5058  6844  784CIP2C_75  3924
1487  3273  5059  6845  784CIP2C_76  3928
1488  3274  5060  6846  784CIP2C_77  3935
1489  3275  5061  6847  784CIP2C_78  3959
1490  3276  5062  6848  784CIP2C_79  3981
1491  3277  5063  6849  784CIP2C_80  3989
1492  3278  5064  6850  784CIP2C_81  4295
1493  3279  5065  6851  784CIP2C_82  4300
1494  3280  5066  6852  784CIP2C_83  4360
1495  3281  5067  6853  784CIP2C_84  4362
1496  3282  5068  6854  784CIP2C_85  4371
1497  3283  5069  6855  784CIP2C_86  4373
1498  3284  5070  6856  784CIP2C_87  4376
1499  3285  5071  6857  784CIP2C_89  4378
1500  3286  5072  6858  784CIP2C_90  4382
1501  3287  5073  6859  784CIP2C_91  4409
1502  3288  5074  6860  784CIP2C_92  4421
1503  3289  5075  6861  784CIP2C_93  4421
1504  3290  5076  6862  784CIP2C_94  4426
1505  3291  5077  6863  784CIP2C_95  4430
1506  3292  5078  6864  784CIP2C_96  4435
1507  3293  5079  6865  784CIP2C_97  4436
1508  3294  5080  6866  784CIP2C_98  4439
1509  3295  5081  6867  784CIP2C_99  4440
1510  3296  5082  6868  784CIP2C_100  4441
1511  3297  5083  6869  784CIP2C_101  4442
1512  3298  5084  6870  784CIP2C_102  4455
1513  3299  5085  6871  784CIP2C_103  4462
1514  3300  5086  6872  784CIP2C_104  4466
1515  3301  5087  6873  784CIP2C_105  4469
1516  3302  5088  6874  784CIP2C_106  4477
1517  3303  5089  6875  784CIP2C_107  4481
1518  3304  5090  6876  784CIP2C_108  4483
1519  3305  5091  6877  784CIP2C_109  4484
1520  3306  5092  6878  784CIP2C_110  4486
1521  3307  5093  6879  784CIP2C_111  4490
1522  3308  5094  6880  784CIP2C_112  4499
1523  3309  5095  6881  784CIP2C_113  4503
1524  3310  5096  6882  784CIP2C_114  4506
1525  3311  5097  6883  784CIP2C_115  4509
1526  3312  5098  6884  784CIP2C_116  4514
1527  3313  5099  6885  784CIP2C_117  4516
1528  3314  5100  6886  784CIP2C_118  4522
1529  3315  5101  6887  784CIP2C_119  4525
1530  3316  5102  6888  784CIP2C_120  4527
1531  3317  5103  6889  784CIP2C_121  4528
1532  3318  5104  6890  784CIP2C_122  4529
1533  3319  5105  6891  784CIP2C_123  4532
1534  3320  5106  6892  784CIP2C_124  4537
1535  3321  5107  6893  784CIP2C_125  4538
1536  3322  5108  6894  784CIP2C_126  4551
1537  3323  5109  6895  784CIP2C_127  4552
1538  3324  5110  6896  784CIP2C_128  4559
1539  3325  5111  6897  784CIP2C_129  4567
1540  3326  5112  6898  784CIP2C_130  4568
1541  3327  5113  6899  784CIP2C_132  4585
1542  3328  5114  6900  784CIP2C_133  4592
1543  3329  5115  6901  784CIP2C_134  4609
1544  3330  5116  6902  784CIP2C_135  4616
1545  3331  5117  6903  784CIP2C_136  4617
1546  3332  5118  6904  784CIP2C_137  4618
1547  3333  5119  6905  784CIP2C_138  4620

1548  3334  5120  6906  784CIP2C_139  4624
1549  3335  5121  6907  784CIP2C_140  4632
1550  3336  5122  6908  784CIP2C_141  4634
1551  3337  5123  6909  784CIP2C_142  4638
1552  3338  5124  6910  784CIP2C_143  4639
1553  3339  5125  6911  784CIP2C_144  4643
1554  3340  5126  6912  784CIP2C_145  4644
1555  3341  5127  6913  784CIP2C_146  4655
1556  3342  5128  6914  784CIP2C_147  4668
1557  3343  5129  6915  784CIP2C_148  4677
1558  3344  5130  6916  784CIP2C_149  4677
1559  3345  5131  6917  784CIP2C_150  4677
1560  3346  5132  6918  784CIP2C_152  4682
1561  3347  5133  6919  784CIP2C_153  4690
1562  3348  5134  6920  784CIP2C_154  4691
1563  3349  5135  6921  784CIP2C_155  4727
1564  3350  5136  6922  784CIP2C_156  4730
1565  3351  5137  6923  784CIP2C_157  4734
1566  3352  5138  6924  784CIP2C_158  4757
1567  3353  5139  6925  784CIP2C_159  4764
1568  3354  5140  6926  784CIP2C_160  4786
1569  3355  5141  6927  784CIP2C_161  4793
1570  3356  5142  6928  784CIP2C_162  4825
1571  3357  5143  6929  784CIP2C_163  4826
1572  3358  5144  6930  784CIP2C_164  4850
1573  3359  5145  6931  784CIP2C_165  4853
1574  3360  5146  6932  784CIP2C_166  4855
1575  3361  5147  6933  784CIP2C_167  4856
1576  3362  5148  6934  784CIP2C_168  4867
1577  3363  5149  6935  784CIP2C_169  4869
1578  3364  5150  6936  784CIP2C_170  4878
1579  3365  5151  6937  784CIP2C_171  4880
1580  3366  5152  6938  784CIP2C_172  4942
1581  3367  5153  6939  784CIP2C_173  4945
1582  3368  5154  6940  784CIP2C_174  4950
1583  3369  5155  6941  784CIP2C_175  4952
1584  3370  5156  6942  784CIP2C_176  4954
1585  3371  5157  6943  784CIP2C_177  4958
1586  3372  5158  6944  784CIP2C_178  4961
1587  3373  5159  6945  784CIP2C_179  5590
1588  3374  5160  6946  784CIP2C_180  5599
1589  3375  5161  6947  784CIP2C_181  5692
1590  3376  5162  6948  784CIP2C_182  5732
1591  3377  5163  6949  784CIP2C_183  5765
1592  3378  5164  6950  784CIP2C_184  5771
1593  3379  5165  6951  784CIP2C_185  5774
1594  3380  5166  6952  784CIP2C_186  5793
1595  3381  5167  6953  784CIP2C_187  5806
1596  3382  5168  6954  784CIP2C_188  5852
1597  3383  5169  6955  784CIP2C_189  5892
1598  3384  5170  6956  784CIP2C_190  6057
1599  3385  5171  6957  784CIP2C_191  6061
1600  3386  5172  6958  784CIP2C_192  6109
1601  3387  5173  6959  784CIP2C_193  6160
1602  3388  5174  6960  784CIP2C_194  6297
1603  3389  5175  6961  784CIP2C_195  6398
1604  3390  5176  6962  784CIP2C_196  6398
1605  3391  5177  6963  784CIP2C_197  6415
1606  3392  5178  6964  784CIP2C_198  6448
1607  3393  5179  6965  784CIP2C_199  6469
1608  3394  5180  6966  784CIP2C_200  6476
1609  3395  5181  6967  784CIP2C_201  6561

1610  3396  5182  6968  784CIP2C_202  6574
1611  3397  5183  6969  784CIP2C_203  6578
1612  3398  5184  6970  784CIP2C_204  6662
1613  3399  5185  6971  784CIP2C_205  6672
1614  3400  5186  6972  784CIP2C_206  6691
1615  3401  5187  6973  784CIP2C_207  6695
1616  3402  5188  6974  784CIP2C_208  6746
1617  3403  5189  6975  784CIP2C_209  6898
1618  3404  5190  6976  784CIP2C_210  6938
1619  3405  5191  6977  784CIP2C_211  6943
1620  3406  5192  6978  784CIP2C_212  7110
1621  3407  5193  6979  784CIP2C_213  7200
1622  3408  5194  6980  784CIP2C_214  7212
1623  3409  5195  6981  784CIP2C_215  7218
1624  3410  5196  6982  784CIP2C_216  7249
1625  3411  5197  6983  784CIP2C_217  7500
1626  3412  5198  6984  784CIP2C_218  7509
1627  3413  5199  6985  784CIP2C_219  7523
1628  3414  5200  6986  784CIP2C_220  7544
1629  3415  5201  6987  784CIP2C_221  7564
1630  3416  5202  6988  784CIP2C_222  7568
1631  3417  5203  6989  784CIP2C_223  7631
1632  3418  5204  6990  784CIP2C_224  7813
1633  3419  5205  6991  784CIP2C_225  7831
1634  3420  5206  6992  784CIP2C_226  7843
1635  3421  5207  6993  784CIP2C_227  7907
1636  3422  5208  6994  784CIP2C_228  7943
1637  3423  5209  6995  784CIP2C_229  8175
1638  3424  5210  6996  784CIP2C_230  8216
1639  3425  5211  6997  784CIP2C_231  8225
1640  3426  5212  6998  784CIP2C_232  8271
1641  3427  5213  6999  784CIP2C_233  8397
1642  3428  5214  7000  784CIP2C_234  8466
1643  3429  5215  7001  784CIP2C_235  8503
1644  3430  5216  7002  784CIP2C_236  8953
1645  3431  5217  7003  784CIP2C_237  9106
1646  3432  5218  7004  784CIP2C_238  9139
1647  3433  5219  7005  784CIP2C_239  9555
1648  3434  5220  7006  784CIP2C_240  9650
1649  3435  5221  7007  784CIP2C_241  9889
1650  3436  5222  7008  784CIP2C_242  9933
1651  3437  5223  7009  784CIP2C_243  9953
1652  3438  5224  7010  784CIP2C_244  9981
1653  3439  5225  7011  784CIP2D_1  746
1654  3440  5226  7012  784CIP2D_2  3558
1655  3441  5227  7013  784CIP2D_3  3558
1656  3442  5228  7014  784CIP2D_4  3633
1657  3443  5229  7015  784CIP2D_5  3658
1658  3444  5230  7016  784CIP2D_6  3732
1659  3445  5231  7017  784CIP2D_7  4004
1660  3446  5232  7018  784CIP2D_8  4700
1661  3447  5233  7019  784CIP2D_9  4703
1662  3448  5234  7020  784CIP2D_10  4774
1663  3449  5235  7021  784CIP2D_11  4894
1664  3450  5236  7022  784CIP2D_12  4918
1665  3451  5237  7023  784CIP2D_13  5159
1666  3452  5238  7024  784CIP2D_14  7443
1667  3453  5239  7025  784CIP2D_15  8673
1668  3454  5240  7026  784CIP2D_16  8679
1669  3455  5241  7027  784CIP2D_17  8727
1670  3456  5242  7028  784CIP2D_18  8734
1671  3457  5243  7029  784CIP2D_19  8756

1672  3458  5244  7030  784CIP2D_20  8818
1673  3459  5245  7031  784CIP2D_21  8844
1674  3460  5246  7032  784CIP2D_22  8846
1675  3461  5247  7033  784CIP2D_23  8912
1676  3462  5248  7034  784CIP2D_24  8918
1677  3463  5249  7035  784CIP2D_25  8918
1678  3464  5250  7036  784CIP2D_26  8941
1679  3465  5251  7037  784CIP2D_27  8941
1680  3466  5252  7038  784CIP2D_28  8951
1681  3467  5253  7039  784CIP2D_29  8951
1682  3468  5254  7040  784CIP2D_30  9007
1683  3469  5255  7041  784CIP2D_31  9012
1684  3470  5256  7042  784CIP2D_32  9013
1685  3471  5257  7043  784CIP2D_33  9025
1686  3472  5258  7044  784CIP2D_34  9053
1687  3473  5259  7045  784CIP2D_35  9054
1688  3474  5260  7046  784CIP2D_36  9054
1689  3475  5261  7047  784CIP2D_37  9113
1690  3476  5262  7048  784CIP2D_38  9134
1691  3477  5263  7049  784CIP2D_39  9152
1692  3478  5264  7050  784CIP2D_40  9152
1693  3479  5265  7051  784CIP2D_41  9211
1694  3480  5266  7052  784CIP2D_42  9223
1695  3481  5267  7053  784CIP2D_43  9223
1696  3482  5268  7054  784CIP2D_44  9231
1697  3483  5269  7055  784CIP2D_45  9236
1698  3484  5270  7056  784CIP2D_46  9236
1699  3485  5271  7057  784CIP2D_47  9303
1700  3486  5272  7058  784CIP2D_48  9309
1701  3487  5273  7059  784CIP2D_49  9314
1702  3488  5274  7060  784CIP2D_50  9326
1703  3489  5275  7061  784CIP2D_51  9339
1704  3490  5276  7062  784CIP2D_52  9348
1705  3491  5277  7063  784CIP2D_53  9376
1706  3492  5278  7064  784CIP2D_54  9382
1707  3493  5279  7065  784CIP2D_55  9407
1708  3494  5280  7066  784CIP2D_56  9414
1709  3495  5281  7067  784CIP2D_57  9439
1710  3496  5282  7068  784CIP2D_58  9485
1711  3497  5283  7069  784CIP2D_59  9493
1712  3498  5284  7070  784CIP2D_60  9501
1713  3499  5285  7071  784CIP2D_61  9526
1714  3500  5286  7072  784CIP2D_62  9526
1715  3501  5287  7073  784CIP2D_63  9551
1716  3502  5288  7074  784CIP2D_64  9557
1717  3503  5289  7075  784CIP2D_65  9568
1718  3504  5290  7076  784CIP2D_66  9588
1719  3505  5291  7077  784CIP2D_67  9597
1720  3506  5292  7078  784CIP2D_68  9615
1721  3507  5293  7079  784CIP2D_69  9628
1722  3508  5294  7080  784CIP2D_70  9649
1723  3509  5295  7081  784CIP2D_71  9652
1724  3510  5296  7082  784CIP2D_72  9660
1725  3511  5297  7083  784CIP2D_73  9662
1726  3512  5298  7084  784CIP2D_74  9725
1727  3513  5299  7085  784CIP2D_75  9746
1728  3514  5300  7086  784CIP2D_76  9777
1729  3515  5301  7087  784CIP2D_77  9787
1730  3516  5302  7088  784CIP2D_78  9790
1731  3517  5303  7089  784CIP2D_79  9842
1732  3518  5304  7090  784CIP2D_80  9842
1733  3519  5305  7091  784CIP2D_81  9848

1734


3520 


5306 


7092 


784CIP2D_82


9867


1735


3521 


5307 


7093 


784CIP2D_83


10010


1736


3522 


5308 


7094 


784CIP2D_84 


10011


1737


3523 


5309 


7095 


784CIP2D_85


10052


1738


3524 


5310 


7096 


784CIP2D_86


10057


1739


3525 


5311 


7097 


784CIP2D_87


10085 


1740


3526 


5312 


7098 


784CIP2D_89


10139


1741


3527 


5313 


7099 


784CIP2D_90 


10142 


1742 


3528 


5314 


7100 


784CIP2D_92


10165


1743


3529 


5315 


7101 


784CIP2D_93 


10173 


1744


3530 


5316 


7102 


784CIP2D_94 


10173


1745


3531 


5317 


7103 


784CIP2D_95 


10273


1746 


3532 


5318 


7104 


784CIP2E_1 


3121 


1747 


3533 


5319 


7105 


784CIP2E_2


3628


1748 


3534 


5320 


7106 


784CIP2E_4


3673 


1749 


3535 


5321 


7107 


784CIP2E_5


4018


1750


3536 


5322 


7108 


784CIP2E_6


4467


1751


3537 


5323 


7109 


784CIP2E_7


4865


1752


3538 


5324 


7110 


784CIP2E_8 


4916


1753


3539 


5325 


7111 


784CIP2E_9


4923 


1754


3540 


5326 


7112 


784CIP2E_10


4926


1755


3541


5327 


7113 


784CIP2E_11


4962


1756


3542


5328 


7114 


784CIP2E_12


4963


1757


3543 


5329 


7115 


784CIP2E_13 


4964 


1758 


3544 


5330 


7116 


784CIP2E_14


4988 


1759 


3545 


5331 


7117 


784CIP2E_15


5835


1760


3546 


5332 


7118 


784CIP2E_16


7682 


1761


3547 


5333 


7119 


784CIP2E_17


7682 


1762 


3548 


5334 


7120 


784CIP2E_18


7699 


1763 


3549 


5335 


7121 


784CIP2E_19 


7707 


1764 


3550 


5336 


7122 


784CIP2E_20


7707 


1765 


3551 


5337 


7123 


784CIP2E_21


7752 


1766 


3552 


5338 


7124 


784CIP2E_22


8357 


1767 


3553 


5339 


7125 


784CIP2E_23


9065 


1768 


3554 


5340 


7126 


784CIP2E_24


9324 


1769


3555 


5341 


7127 


784CIP2F_1


2976 


1770 


3556 


5342 


7128 


784CIP2F_2


3559 


1771 


3557 


5343 


7129 


784CIP2F_3


4021 


1772 


3558 


5344 


7130 


784CIP2F_4


4474 


1773


3559 


5345 


7131 


784CIP2F_5


4566 


1774


3560 


5346 


7132 


784CIP2F_6


4705 


1775


3561 


5347 


7133 


784CIP2F_7


4707 


1776


3562 


5348 


7134 


784CIP2F_8


4712 


1777


3563


5349 


7135 


784CIP2F_9


5008 


1778


3564


5350 


7136 


784CIP2F_10


5009 


1779


3565


5351 


7137 


784CIP2F_11


5015 


1780


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353


7139


784CIP2F_13


1*71 A 


1782 


3568 


5354 


7140 


784CIP2F_14


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8828 


1784 


3570 


5356 


7142 


784CIP2F_16


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739 


1786 


3572 


5358 


7144 


784CIP2F_18


9896


TABLE 7



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHIiSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG 
ETVYYSVEYQGEYESLYTSHIWIPSSWCSLTEGPECDVTDDITA 
TVP YNLRVRATLGSQTS /CLEHP /VS I PLIETQ PSLPDL/RME I 
TKDGFHLVIELEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL 
VLALFAFVGFMLI LVWPLFVWKMGRLLQ/ YLLLPRGGS SQTPW 
KITQF 


5360


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
WLPTGDVWS RPDGS YLNKLLITRARQDDAGMY I CLGANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVF1L 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALS AGPGVGLCEEHGS PAAPQHLLGPGPVAGP KLYPKLYTGHS 
TPHT YTHPPPS CQLNSSHS 


5361 


3 


925 


HEGS I S SAN I LLDDQFQ P KIjT DFAMAH FRSHLEHQ S CT INMT S S 
S SKHLWYMPEE Y I RQGKLS I KTDVYSFG I VIMEVLTGCRWLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGR CAATRAKLR P SMDEVLNTLESTQAS LYFAE DP PTS LKS FRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDiiMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


4879 


SCQVEGCTRTYNSSQSIGKHMKTAHPDQYAAFKMQRKSKKGQKA 
NNLNTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VSP P I FPAHLASVSTPLLS SMESVINPNI TSQDKNEQGGMLCS Q 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPSPADSGTNSVFSQLENNTNHYSSQIEGNTOSSFLKGGNGENA 
VFPSQVNVANNFS STNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAIIRDGKFICSRCYRAFTNPRSLGGHLSKRSYCKPLDGA 
EIAQELLQSNGQPSLIiASMILSTNAVNLQQPQQSTFNPEACFKD 
PSFLQLLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSEIIIQAL 
ETAGI PSTFEGAEMLSHVSTGCVSDASQVNATVMPNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTS S I EECS S LP VFPTNDLLLKTVEN 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKKGNS 
AS KRRKKVAPPL I APNAS QNLVTSDLTTMGL IAKSVE I PTTNLH 
SNVIPTCEPQSLVENLTQiCLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMALNS CTTSVNSDLQISEDNVTQNFEKT 
LE I IKTAMNSQI IiEVKSGSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGIi 
QKLKLENDLSTPASQCVLINTSVTLTPTPVKSTADITVIQPVSE 
MINIQFNDKVNKPFVCQNQGCNYSAMTKDALFKHYGKIHQYTPE 
MILEIKKNQLKFAPFKCWPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 

PALELRAETQNTHSNVAVI PEKQLIEKKS PDKTESS LQ VITVTS 
EQCNTNALTNTQTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDIiPAFSAEVEE 

ESEAGKESEETETKQTLKEFRCQVSDCSRI FQAITGL I QHYMKL 
HEMTPEEIESMTASVBVGKFPCDQLECKSSFTTOLNYVVHLRAD 
HGIGLRASKTEEDGVYKCDCEGCDRIYATRSNLLRHIFNKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSIjKRGKHVYSIKARN 
DALSECTSRFVTQYPCMIKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTATVSQKEVEKNE*DEMDELTEIjFITKLINEDSTSVETQA 
NTSSNVSITOFQEDNLCQSERQKASNLKRVNKEKNVSQNKKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEEHPASFDWSSFKPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL
VLKQLQEMKPTVSLKKLEVHSNDPDMSVMKDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKP IAEKCFDHAAGTS YWGETWEKP YQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML\ CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLP FTYNGRTFYS CTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVL VQTRGGNSNGALCHFPFL YNNHNYTDCTS EGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTC I AYSQLRDQCI VDDI T YNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTIKGLKPGWYEGQLI S IQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPIjSRNTF\AEN 
TGLS PGVT Y Y F KVFAVS HGRE S KP LTAQQTT KlA DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPAS EYTVSLVAI KGNQES PKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVE YVYTI QVLRDGQERDA? \ IVNK\ WTPLSP PTNLH 
LEANPDTGVLWSWERSTTPDITGYRITTTPTN'GQQGNSLEEVV 
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTI IPAV 
PPPTDLRFTN/ 1 LGPDTMRVTW \ AP P PS IDLTNFLVRYS PVKNE 
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVRYYRIT YGETGGNSPVQE FTVPGS KS TATI SGLKPGVD 
YT ITVYAVTGRGDS PAS S KP IS INYRTEI DKPSQMQVTDVQDNS 
ISVKV^SSSPVTCYRVTTT\PKNGPG\PTK7KTAGPDQTEMTI 
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNTDRPKGLAFTDV 
DVDS I KI AWES PQGQVS RYRVTYS S PEDG IHELF PAPDGEEDTA 
ELQGLRPGSEYTVSWAIiHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTS LS AQWTPPNVQLTGYRVRVT P KE KTGPMKE INLAPDS S 
SVVVSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETTITISWRTKrETITGFQVDAVPANGQTPIQRTIKP 
DVRSYTITGLQPGTDYKIYIiYTLNDNARSSPWIDASTAIDAPS 
NLRFIiATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEAT I TGLEPGTE YTI YV I ALKNNQKS E PLIGRKKTDELP 
QLVTLPHPNLHGPE I LDVPSTVQKTPFVTHPG YDTGNGIQL PGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQREKVREEVVTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGAS KSKRQAQQMVQPQS PVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML\ CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYKGRTFYSCTTEGRQIX3HLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNWHNYTDCTSEGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP- 
GHLNS YT I KGLKPG WYEGQL 1 S I QQ YGHQE VTRFDFTTTS TST 
PVTSNT\VTGETTPFSPliVATSES VTE ITASS FWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSliILSTSQTTAPDAPPDPTVDQVDDTSIVVRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDliQFVEVTDV 
KVTIMWTPPESAVTGYRVDVTPVNLPGEHGQRLPLSRNTF\AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQ I TGYRLTVGLTRRGQPRQYNVGP SVS KY 
PLRMLQPASE YTVSLVAI KGNQES PKATGVFTTLQPGSS I PP YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVE YVYTIQVLRDGQERDAP \ I VNK\ WTPLS PPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQOGNSLEEW 
HADQSSCTF\DNLEVPGLEYUVSVYTVKDDKESVPISDTIIPAV 
PPPTDIiRFTN/ILGPDTMRVTWVAPPPSIDLTNFLVRYSPVKNE 
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\liRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\lAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVA3jNGREESPLLIGQQSTVSDVPRDIiEWAATPTSLLI\SWD 
APAVTVRYYR I TYGETGGNS PVQEFTVPGS KSTATISGLKPGVD 
YTITVYAVTGRGDSPASSKP I SINYRTE I DKPSQMQVTDVQDNS 
ISVKWLPSSS PVTGYRVTTT\ PKNGPG\ PTKTKTAG PDQTEMTI 
EGLQPTVEYWSVYAC2NPSGESQPLVQTAVTNIDRPKGLAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSE YTVS WALHDDMESQPL I GTQS TAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKE INLAPDSS 
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETT IT I S WRTKTETITGFQVDAVPANGQTP IQRTI KP 
DVRSYTITGLQPGTDYKIYIjYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNWQKSEPLIGRKKTDELP 
QLVTLPHPNLKGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMI FEEHGFRRTTPPTTATP IRHRPRPYPPNVGQE 
ALSQTTI SWAPFQDTSEYI I SCHPVGTDEEPI^FRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQRHKVREE WTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQSYNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 



5365 



8066 



703 



RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR
ITCTSRNRCNDQDTRTS YRIGDTWS KKDNRGNLLQCICTGNGRG 

EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 

DSGWYSVGMQLA* KTQGNKQML \ CTCLGNGVS CQETAVTQT YG 

GNSNGEPCVLPPTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 

KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

DNMKWCGTTQNYDADQKFGFCPMAAHEE I CTTNEGVMYR I GDQW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNSYTIKGLKPGWYEGQLIS1QQYGHQEVTRFDFTTTSTST 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFR VE YELSEEGDEPQ YLVLPS TATSV\NI P\ DLLPGRKYI VN 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 

PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 

KVT I MWT P P E S AVTG YRVDVI P VNL P G EHGQRLPLSRNTF \ AEN 

TGLS PGVTY YFKVFAVSHGRESKPLTAQQTTKL \DAPTNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLTPGVEYVYTIQVLRDGQERDAP\ IVNK\WTPLSPPTNLH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 

HADQS SCT F\ DNLEVPGLE YNVS VYTVKDDKES VPI SDTI I PAV 

PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 

GRMLQSLS I FFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 

\LRGRQKTGLDSP\TGIDFS\DITA\WSFT\VHW\IAPRA/TPI 

TGYRI R\ HH PEHF\ SGRPREDR\ VPHSRNSITLTNLTPGTE YW 

S IVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSIjLI \swd 

APAVTVRYYRI T YGETGGNS PVQE FTVPGSKSTAT I SGLKPGVD 

YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS 

ISVKWLPSSSPVTGYRVTTT\ PKNGPG \PTKTKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSE YTVS WALHDDMESQPL IGTQSTAI PAPTDLKFT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 

S WVSGLMVAT KYEVSVYALKDTLTSR PAQGWTTLENVS PPRR 

ARVTDATETT I T I SWRTKTET I TGFQVDAVPANGQTP IQRTI KP 

DVRSYTITGLQPGTDYKIYLYTLND3SIARSSPWIDASTAIDAPS 

NLRFIiATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 

QLVTL PHPNLHG PE I LDVPS TVQKTPFVTHPGYDTGNG IQLPGT 

SGQQPSVGQQM I FEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 

ALSQTT ISWAPFQDTSEYI I SCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGAT YN 1 1 VEALKDQ QRHKVRE E WT VGNS VNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 

ADREDSRE 


5366 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA* KTQGNKQML \ CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ
KYS FCTDHT VLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

DNMKWCGTTQNYDADQKFGFCPMAAHEE I CTTNEGVMYRIGDQW 

DKQHDMGHMMRCTCVGNGRGBWTCIAYSQLRIJOCIVDDITYNVN 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIVVRWSR 

PQAPITGYR I VYS PS VEGS STEIiNLPETANS VTLSDLQPGVQYN 

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLTPGVE YVYTIQVLRDGQERDAP\ IVNK\ WTPLSPPTNLH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 

HADQSS CTF\DNLEVPGLEYNVSVYTVKDDKES VPXSDTI I PAV 

PPPTDLRFTN / 1 LGPDTMRVTW\AP PPS IDLTNFLVRYS P VKNE 

GRMLQSLSIFFIiSDN\AWIjTNLLPGTEYVVSVSSVYEQHESTP 

\LRGRQKTGLDS P \TG I DFS \DITA\NS FT\ VHW \ I APRA/ TP I 

TCYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 

SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 

YT I TVYAVTGRGDSPASS KP I S INYRTE IDKPSQMQVTDVQDNS 

ISVKWLPSSS P VTGYRVTTT\ PKNGPG\PTKTKTAGPDQTEMT I 

EGLQPTVE YWSVYAQNPSGE SQPLVQTAVTN I DRPKGLAFTDV 

DVDS I KI AWESPQGQVSRYRVTYS S PEDGIHELFPAPDGEEDTA 

ELQGLRPGSE YTVS WALHDDMESQPL IGTQSTAI PAPTDLKFT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKE INLAPDS S 

S WVS GLWATKYE VS VYALKDTLTSR PAQGVVTTLENVS PPRR 

ARVTDATETT I T I S WRTKTET ITGFQVDAVPANGQTP IQRTI KP 

DWS YTITGLQPGTDYKI YLYTLNDNARSS PWI DASTAI DAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVTALKNNQKSEPLIGRKKTDELP 

QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 

SGQQPS VGQQM I FEEHGFRRTTPPTTATP IRHRPRP YPPNVGQE 

ALSQTT I S WAP FQDTS E Y IIS CHPVGTDEE PL(j FR V Pu I e> i. £> A 1 

LTGLTRGAT YNI I VEALKDQQRHKVREE WTVGNS VNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKE YLGAICS CTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 

ADREDSRE 


5367 


235 


3591 


KKILNMLCKKNIVIEYLADILYEYLYGFCFSGIKKYLIIHVLRL 
ILELWMTRLLLEKSVSLQTQYLLLIVKILSWFPGKEMRHHLQIM 
EVMMRKQDS/RIVGNGSEQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 
GDSEASSPFTPVADEDSWFSKLTYLGCASVNAPRSEVEALRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYSFATAFRRS AKQTPLSATAAPQTPDSD I FTFS VSLE I KEDDG 
vn virc avd yd vnD opp trr .pnn t mcKT v T YVOOTTNKE LAI ERCF 
GLLLSPGKDVRNSDMHLIiDLESMGKSSDGKSYVITGSWNPKSPH 
FQWNEETPKDKVLFMTTAVDLVITEVQEPVRFLLETKVRVCSP 
NERLFWPFSKRSTTENFFLKLKQIKQRERKNNTDTLYEWCLES 
ESERERRKTTASPSVRLPQSGSQSSVXPSPPEDDEEEDNDEPLL 

SGSGDVS KECAEKI LETWGELLSKWHLNLNVRPKQLS S LVRNGV 
PEALRGEVWQLLAGCHNNDHLVEKYRILITKESPQDSAITRDIN
RTFPAHDYFKDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFLA 
AVLLLHMPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQEYIPDLYNHFLDISLEAHMYASQWFLTLFTAKFPLYMVFH 
I IDLIiLCEGIS VIFWALGLLKTS KDDLLLTDFEGALKFFRVQL 
P KR YRS E ENAKKLM E LACNMK I SO KKLKKYEKE YHTMREOOAOO 
EDPIERFERENRRLQEANMRLEQENDDLAHELVTSKIALRiCDLD 
NAEEKADALNKELLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKGISSTKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\QLVEAECKIQD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLSS IKTATGVQGKETC 


5368 


573 


2014 


GAAAGAADPRRGSLGGRTMLDFAI FAVTFLLALVGAVLYLY PAS 
RQAAGIPGITPTEEKDGNLPDIVNSGSLHEFLVNLHERYGPWS 
FWFGRRLWSLGTVDVLKQHINPNKTLD/LF *NHAEVI IKVS I W 
WWQCE * KP \ QRKKLYENGVTDSLKSNFALLLKLPEELLDKWLSY 
P ET0H\ VP h S OHMLG FAM KS VTOM VMGSTFEDDOE V I RFO KNHG 
TVWSE IGKGFLDGSLDKNMTRKKQ YEDALMQLES VLRN 1 1 KERK 
GRNFSQHIFIDSLVCJGNLNDQQILEDSMlFSIiASCIITAKLCTW 
AIWFLTTSEEVQKKLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTP VS AQLQD I EGKIDRF 1 1 PRETLVL YALGWLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLS VLVKRLHLLSVEGQVIETKYELVTS SREEAWITVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRJULRGTMSASFVPNGASLED 

CHCNLFCHjADLTGIKWKKYVWQGPTSAPILFPVTEEDPILSSFS 

RCLKADVLG/ VWRRDQRPERRE \ L * I FWGGEDP \ VLLTLFTMTY 

QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 

VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 

LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQAFKMSDSATKK 

LIGEWKQFYP I S CCLKEMSEEKQEDMDWEDDS LAAVE VLVAGVR 

MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 

MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 

GGKIPRKLANHVVDRWQECNMNRAQNKRKYSASSGGLCEEATA 

AKVAS WDFVEATQRTNCS CLRHKNLKSRNAGQQGQAPSLGQQQQ 

ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 

SQRLV\ I SAP \ DSQ\ VRFSNIR\TNDVAK\ TPQMHGTEMANSPQ 

PP PLS P \HPCDWDEGVTKTPSTPQS QHFYQMPTPDPLVPSKPM 

EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 

PLKVSDELVQQYQIKNQCLSArASDAEQEPKIDPYAFVEGDEEF 

LFPDKKDRQNSEREAGKKHKVEDGTS S VTVLSHEEDAMSLFSPS 

I KQDAPRPTSHARPPSTS L I YDSDLAVS YTDLDNLFNSDEDELT 

PGSKRSANGSDDKASCKESKTGNLDPLSCISTADX.HKMYPTPPS 

LEQHIMGFSPMNMNNKEYGSMDTTPGGTVLEGNSSSIGAQFKIE 

VDEG FCS P KPSE I KDFS YVYKP ENCQ I LVGCSMFAP LKTLPSQ Y 

LPLIKLPEECIYRQSWTVGKLELLSSGPSMPPIKEGDGSNMDQE 

YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 

RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLNSVEP 

ATVPS I PEAHSLYVNLILSESVMNLFK3)CNSDSCCICVCNMNIK 

GADVGVYI PDPTQEAQYRCTCG FSAVMNRKFGNNSGLFFEDELD 

I IGRNTDCGKEAEKRFEALRATSAEHVWGGLKES BKLSDDL ILL 

LQDQCTNLFSPFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 

LEHGRQFMDNMSGGKVDEALVKSSCLHPWSKRNDVSMQCSQDIL 

RMLl^LQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 

TDESPEPLPIPTFLLGYDYDYLVLSPFALPYWERLMLEPYGSQR 

D I AYWLCPENEALLNGAKS F FRDLTAI YESCRLGQHRPVS RLL 

TDGIMRVGSTASKKLSEKLVAEWFSQAADGNNEAFSKLKLYAQV 

CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 

NTPSATLASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 

SNLNSGVSSNKLPS FPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 

QTSALQTAGISGESSSLPTQPHPDVSESTMDRDKVGIPTDGDSH 

AVTYPPAIVVYIIDPFTYENTDESTNSSSVWTLGLLRCFLEMVQ
TLPPHIKSTVSVQIIPCQYLLQPVKHEDREIYPQHLKSLAFSAP 
TQCRRPLPTSTNVKTLTGFGPGI1AMETAI4RSPDRPECIRI1YAPP 
FILAPVKDKQTELGETFGEAGQKYWLFVGYCLSHDQRWILASC 
TDLYGELLETCIINIDVPNRARRKICSSARKFGIiQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCGISAADSPSILSACLVAMEPQGSFVIMPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPDIINIIiPASPTGSPVHSP 
GSHYPHGGDAGKGQSTDRLLSTEPHEEVPNILQQPLALGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASIiHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVIiEQYNALSWLTCDPATQDRRSCLPIH 

FVVLNQLYNFIMNML 



5370 



1226 



716 



RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGLGNTPLHLAACTNHVPVI TTLLRGGAR VDALDR 
AGRTPLHLAKSKLNILQEGHAQCIiKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRV?LGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSIiALAESLSLFRACTSLPVG 

GCISWL 



5371 



1331 



167 



5372 



51 



857 



5373 



2814 



346 



IAAMLWKLLLRSQSCRLCSFRKMRSPPKYRPFIACFTYTTDKQS 
S KENTRTVEKLYKCS VDIRKIRR\ * KDGYF * RMKPMLKKLRI / F 
LQELGADETAVASILERCPEAIVCSPTAVNTQRKLWQLVCKNEE 
ELIKLIEQFPESFFTIKDQE^QKLNVQFFQELGLKNVVISRLLT 
AAPNVFHNPV^KNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
NPFIIjIjNSPTAIKETLEFLQEQGFTSFEILQLIjSKLKGFLFQLC 
PRS IQNS ISFSKNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREGI S IAQIRETPMVLELTPQI VQYRIRKLNSSGYRI KDG 

HIANLNGSKKEFSANFGKIQAKKVRPLFNPVAPLNVEE 

SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLL I LLFVTELSGAHNTTVFQGVAGQSLQVS CP YDSMKHWGR 
RKAWCRQLGEKGPCQRWSTHNLWLLSFIiRRWNGSTAITDDTLG 
GTLTITLRNLQPHDAGLYQCQSLHGSEADTLRKVLVEVLADPLD 
HRDAGDLWFPG\DLRASRMPMWSTASPGASWKEKSPSHPLPSFS 
SWPASFSSRF*QPAPSGLQPGMDRSQGHIHPVNWTVAMTQGISS 

KLCQG

VKKTKS I FNSAMQEMEVYVENIRRKFGVFNYS P FRTP YTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRR I SbSDMPRS PMSTNSSVHTGSDVEQDAEKKATSSHFS ASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQIjSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVIiTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP 
AVQRSCGTS STVQQKEI TQS PSTST I TLVTSTQSS PLVTSSGSM 
STLVSS VNGDLP IGTASADVAADIAKYTSKL\MDAIKGTM\TE I 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQER0RLIAEVKKQLELEKOQAVDETKKKQWC 
ANFKKEAIFYCCWNTSYCDYPCQ\QAHWPEH\MXSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILIiGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 



VKKTKS I FNSAMQEMEVYV ENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDIjKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTrESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSE YISDDEQKS »GTSQEDTEDKEGCQMpKEPSAVKKKPKP 



5374 



2814 



346


TNPVE I KEBLKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEP KE PSPKQDWGKTPPSTTVGSHS P PETPVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE \TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITIiVTSTQSSPLVTSSGSM 
S TLVS S VNGDLP I GTAS ADVAAD I AKYTS KL \MDAI KGTM \TE I 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWIiHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIPYCCWOTSyCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DBKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIFLAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTLGKES 
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWA\LWLHTR 
RCQA/RGL PLPCPECGRR FRHAPFLALHRQVHAAATPDWGFACH 
LCGQS FRGWVALVLHLRAHSAAKAGPFACPKMARDAFWRRKAAS 
SSILRRCHPSRPRGPRPFICGNCGRSILPTWDQ/IjKVAHKRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKRFRHK\PNI*IRSHAACrSGERPHQ/CSRECG\KRFTNKPY 
LTS \HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDPIEAPPSLYSCDDCGRSFRLERFLRAHQRQHTGERPFTCAEC 
GKNFGKKTHLVAHSRVHSGERPFRIiARKCGRRFLPRASQSGGRN 
SAE PNAPRFG P FVCPDCGKAFRHKP YLAAHRP I ATPAEKP YVCP 
DCRKAFSQKSNLXVSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKSHIRDGAFCCAICGQTFDDEERLIiAHQKKHDV 


5376 


4504 


591 


VSTFSLCLWPAGGGGRGRVSNMAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEVI GKGHRGTVAYVGATLFATGKWVGVI LDEAKG 
KNDGTVQGRKYFTCDEGHGI FVRQSQ XQVFEDGADTTS PETPDS 
S AS KVLKREGTDTTAKTS KLRGLKP KKAPTARKTTTRRPKPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPIIPTP 
VLTSPGAVPPLPSPSKEEEGLRAQVRDLEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADLQRRLKEARKBAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLQQEVEALKER 
VDEIiTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEWRQQRERLQEELSQ 
AESTI DELKEQVDAALGAEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDELQENARBTELELRBQLDMAGARVREAQKRVEAA 
QETVADYQQTIKKYRQLTAHIjQDWRELTNQQEASVERQQQPPP 
ETFDFKIKFAETKAHAKAIEMELRQMEVAQANRHMSLIjTAFMPD 
SFLRPGGDHDCVLVLLLMPRIjICKAELIRKQAQEKFELSENCSE 
RPGLRGAAGEQIiS FAAIGLVY\SLMPAAGHRYHRY* CHALSQCR 
LD\VYKKVGSLYPEMSAHERSLDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYSIHLAEQPEDCTMQLADHIKFTQSALDCMSVEVG 
RLRAFWGQEATDIALLLRDLETSCS\DIRQFCKKIRRRMPGT 
DAPG1 PAALAFGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQLI 
APLAENEGLLVAALEELAFKASEQIYGTPSSSPYECIiRQSCNIL 
I STMNK\LVTAMQEGEYDAERPPSKPPP \ VELRAAALRAE I TDA 
EGLGLKLEDRETVIKELKKSLKIKGEELSEANVRLTLLEKKLDS 
AAKDADERIEKVQTRLEETQAIiLRKKEKEFEETMDALQADIDQL 
EAEKAELKQRLNSQSKRTIEGIiRGPPPSGIATLVSGIAGEEQQR 
GAI PGQ APGS VPGPGLVKDS PLLLQQ I S AMRLH I SQLQHENS 1 1» 
KGAQMKAS LASLP PLHVAKLSHEGPGS ELPAGALYRKTSQLLET 
IjNQLS THTHWD I TRTS PAAKS PSAQLMEQVAQLKS LiSDTVE KL 
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQLHQLHSRLIS 


5377 


762 



1106 


DVPCKRVLPAEAQEKGQLTLSCGESGEEG\F*YHEVRQAEGES* 
/WFGPNVRLVHTQLKTKKPSGTLKAKFYLHTGSTKFAARISCTK 

S S * WPG YDGW WGGQY I F I FRGMRWEEQ P 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)

SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 

VNPIFSEAC 



5379 



2009 



664 



QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 

PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIVVTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLRYQKII
HXRDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIVVPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PPLVGAPGSHFCFLNIALLRYNSHTM

PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME
SFIVVTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIVVPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNIALLRYNSHTM



5380 



2050 



5381 



2050 



5382 



1536 



203

GARGSQQDAPALQEAEVRGPERAQPARGRMTKARLFRLWLVLGS
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE
LTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMEESVRG
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH
RGAPYRDPLRIPREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/*PQVRRAHAAAV
RQPHQPARLGARGLPRWPQ\VSFANFIQYLLDPHTEKLAPFNEH
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDLAAPLP 
PELPGTGPPSSWEEDWFAKIPLAWRQQLYKLYEADFVLFGYPKP 
ENLLRD 


5383 



45 



5250 



VERLI^CRNSK^TWRMLISK^IPWRIU,QGISFGMYSAEELKKLS 

VKS ITNPR YLDSLGNPSANGLYDLALGPADS KEVCSTCVQDFSN 

CSGHLGHI ELPLTVYNPLLFDKLYLLLRGSCLNCHMLTCPRAVI 

HLLLCQLRVLEVGALQAVYELBRILSRFLEENADPSASEIREEL 

EQYTTE I VQNNLLGSQGAHVKNVCES KSKL I AL FWKAHMNAKRC 

PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 

IGKRGYLTPTSAREHLSALWKNEGFFLNYLFSGMDDDGMESRFN 

PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWL 

IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSFLSTLPGQ 

SLIDKLYKIWIRLQSHVNIVFDSEmKLMMDKYPGIRQILEKKE 

GLFRKHMMGKRVD YAARS VICPDMYINTNB IG I PMVFATKXjTYP 

QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 

QREAVAKQLIjTPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 

RPSIQAHRARILPEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 

ELGRAEAYVLACTDQQYLVPKDGQPLAGLIQDHMVSGASMTTRG 

CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 

TLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 

ESQVIIREGELLCGVLDKAHYGSSAYGLVHCCYEIYGGETSGKV 

LTCLARLFTAYLQLYRGFTLGVEDIIjVKPKADVKRQRIIEESTH 

CGPQAVRAALNLPEAASYDEVRGKWQDAHLGKDQRDFNMIDLKF 

KEEVNHYSlSrEINKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 

MQISCLLGQIELEGRSTPLMASGKSLPCFEPYEFTPRAGGFVTG 

RFLTG I KPPE F FFHCMAGREGLVDTAVKTSRSG YLQRCI I KHLE 

GLWQYDLTVRDSDGSWQFLYGBDGLDIPKTQFLQPKQFPFLA 

SNVEVIMKSQHLHEVLSRADPKKALHHFRAIKKWQSKHPNTLLR 

RGAFLSYSQKIQEAVKALKLESENRNGR/RPWDS/G/RMLRMWY 

ELDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 

YSQBWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 

LLAAQS IGEPS TQMTLNTFHFAGRGEMNVTLG I PRLREILMVAS 

ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 

ESFCMEEKQNKFQVYQLRFQFLPHAYYQQEKCLRPEDILRFMET 

RFFKLLMES I KKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 

EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 

E EREGE ENDDE DMOE ERNPTTR 15G AR KTOEODRE VOTj /fJH * OOP V 

PSRP PDAAPETHPQPGAPGA\SAMERRVQAVRE IHPFI DDYQYD 
TEESLWCQVTVKLPLMKINFDMSSLVVSLAHGAVIYATKGITRC 
LLNETTNNECNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 
AI ANT YG IEAALRVI EKE IKDVFAVYG IAVDPRHLS LVADYMCF 
EGVYKPLNRFG IRSNSS PLQQMTFETS FQFLKQATMLGSHDELR 
S PS ACLWGKWRGGTGLFELKQPLR 


5384 


196 


886 


QSCGQRLPTVL*L*GPPGSCPCILSLF\PGRPHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGIYFFSLNVHSWWYKETYVHIMHNQKEAVILYAQPS 
ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 
SGHLIKAEDD 


5385 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVTIKFPLTTE*A
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATXIGIT 


5386 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
SDDLWPGFFELVVRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL 
ALFFPE^m^ASLGAAWVADGVQCDR^VVNGI I ATVWS WI I IAA 
TWS III VFDPLGGKMAPYSSAG PS HLDS HDS S QLLNGLKTAAT 
SVWETRIKLLCCCIGKDDHTRVAFSSTAELFSTYFSDTDLVPSD
IAAGLALLHQQQDNIRNNQEPAQWCHAPGSSQEADLDAELKNC
HHYMQFAA7^AYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M
VGGDQLQL/CTSAPHjHTHRAAVQGLHPRQLPWTRFTELPFLVA
LDHRKESVWAVRGTMSLQDVLTDLSAESEVLDVECEVQDRLAH
KGISQAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGGGAA
ALLATMVRAAYPQVRCYAFSPPRGLWSKALQEYSQSFIVSLVLG
KDVIPRLSVTNLEDLKRRILRWAHCNKPKYKILLHGLWYELFG
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYSFSSDSPL
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE
FSKILIGPKMLTDHMPDILMRALDSWSDRAACVSCPAQGVSSV
DVA


5388 


1569 


753 



TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF
TVVYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN


5389 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF
TVVYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN


5390 

1 


217 


1332 



" EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDSITRDDNIAAFKRIRLRPRYLRDVSEVDTRTTIQGEEI 
SAPICIAPTGFHCLVWPDGEMSTARAAQAA\GICYITSTFASCS
LEDIVIAAPEGLRWFQLYVHPDLQLNKQLIQRVESLGFKALVIT
LDTPVCGNRRHDIRNQLRRNLTLTDLQSPKKGNAIPYFQMTPIS
TSLCWNDLSWFQSITRLPIILKGILTKEDAELAVKHNVQGIIVS
NHGGRQLDEVLASIDALTEWAAVKGKIEVYLDGGVRTGNDVLK
ALALGAKCIFLGDAILWALASKGEHGVKEVLNILTNEFHTSMA\
LTGCRSVAEINRNLVQFSRL


5391 


1 


1292 



VKKAAGRSRGPPTAGGQRCEEAPGTVMERRLGVRAWVKENRGSF 
QPPVCNKL^QEQLKVMFVGGPNTRKDYHIEEGEEVFYQLEGDM 1 
VLRVLEQGKHRDWIRQGE I FLLPARVPHSPQRFANTVGLWER 
RRLETELDGIiRYYVGDTMDVLFEKWFYCKDLGTQLAPI IQEFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREIi 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDVWLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLS WGPS Y \AW\ERTQGS VALSVT\Q 
D PACKKS PWGEPSCHGLKAATGVPSTLEVPSLPNNS PS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVLPGGLPPAPLLP I PLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 



IRGSNAQKWGASGSGGAGPQPDPAGPGGVPALAAAVLGACEPR 
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 
FIHKPAHGWLHPDARVLGPGVSYWRYMGCIEVLRSMRSLDFNT 
RTQVTREAINRLHEAVPGA/RGSWKKKAPNKALASVLGKSNLRFA 
GMS IS IHISTDGLSLSVPATRQVIANHHMPSISFASGGDTDMTD j 

YVAYVAKDP INQRACHILECCEGL\ AQS 1 1 STVGQAFELRFKQY 
LHSPPKVAIiPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 1 
GGLVDSRLAIiTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 
■Dr'rvr'VAmiinTiPmaDmTR'PUT WKTTnnT.nAPEPEDSPKKDLFDMR 
PFEDALKLHECSVAAGVTAAPLPIiEDQWPSPPTRRAPVAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHIiLLVDPEGVVRTKDVLFESISHLIDHHLQNGQPIVAAE j 

SEIjHLRGWSREP J 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5394 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLGGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5396 



3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS


5398 


56 



5426 


SGE VCRMESNFNQEGVPRP S YVF S ADP I ARPSE I NFDG IKLDIjS 

HEFSLVAPNTEANSFESKDYLQVCLRIRPFTQSEKELESEGCVH 

ILDSQTWLKEPQCILGRLSEKSSG\QM\AQKFSFFPGFLGPAT 

TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGIiTNSGKTYTFQGTE 

ENIRILPRTLNVIiFDSLQERIiYTKMNLKPHRSREYLRIiSSEQEK 

EE IAS KSALLRQI KEVTVHNDSDDTLYGSLTNSLNI SEFEES I K j 

DYEQANLNMANSIKFSVWVSFFEIYNEYIYDLFVPVSSKFQKRK 

MLRLSQDVKGYSF I KDLQW IQVSDS KEAYRLLKLGI KHQS VAFT 

KLNNASSRSHSIFTVKILQIEDSEMSRVIRVSELSLCDLAGSER 

TMKTQNEGERIiRETGN I NTSLLTLGKC I NVLKN SE KS KFQQHVP 

FRESKLTHYF /QS FFMGKGKI CMI VNI SQCYLAYDETLNVLKFS 

AIAQKVCVPDTLNSSQEKLFGPVKSSQDVSLDSNSNSKILNVKR j 

ATI SWEN3 LEDLMEDEDLVEELENAEETED / VGETKLLDEDLDK 

TLEENKAFISHEEKRKLDDLIEDLKKKLINEKKEKLTLEFKIRE j 

EVTQEFTQYWAQREADFKETLLQEREILEENAERRLAIFKDLVG 

KCDTREEAAKDI CATKVETEEATACLELKFNQI KAELAKTKGEL 

IKTKEELKKRENESDSLIQELETSNKKIITQNQRIKELINIIDQ 

KEDTINE FQNLKSHMENTFKCNDKADTS SL I INNKLI CNETVEV 

PKDSKSKICSERKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 

SEEVRPNIAEIEDIRVLQENNEGLRAFLLTIENELKNEKEEKAE 

LNKQIVHFQQELSLSEKK^TLSKEVQQIQSNYDIAIAELHVQK 

SKNQEQEEKIMKLSNE IETATRS ITNNVSQ IKLMHTKIDELRTL 

DSVSQISNIDLLNLRDLSNGSEEDNLPNTQLDLLGNDYLVSKQV 

KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 

Q I E KLQAE VKG YKDENNRLKE KEHKNQDDLLKE KETL I QQLKEE 

LQEKNVTLDVQIQHWEGKRALSELTQGVTCYKAKIKELETILE j 

TQKVERSHSAKLEQDILEKESI IliKLERNLKEFQEHLQDSVKNT 

KDLNVKELKLKEEITQLTNNMDMKHLiLQLKEEEEETNRQETEK 

LKEELSASSARTQN\LNADLQRKEEDYADLKEKLTDAKKQIKQV 

QKEVSVMRDEDKLLRIKINELEKKKNQCSQELDMKQR\TIQQI*K 

EQLINQKVEEAIQQYERACKDLNVKEKI IEDMRMTLEEQEQTQV 

EQDQVL\EAKLEEVERIiATEIiDRWRVKCNDLETKNNQRSNKEHE 

NNTDVLGKLTISnL«QDELQESEQKYNADRKKWLEEKM^LITQAKE^ 

ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 

ERDQLVAALEIQLKAxjISSNvQlUJWciiiiUiJiucx lanioMaxvi | 

MDIKPKRI S SADPDKLQTEPLSTS FE I SRNKI EDGS WLDSCEV 

STEOTQSTRFPKPELEIQFTPLQPNKMAVKHPGCTTPVTVKIPK 

ARKRKSNEMEEDLVKCENKKNATPRTNLKFPISDDRNSSVICKEQ 

KVAIRPSSKKi YoLiKoyAox J.l3vJNiifti jvtuvcvj i iJV^ r ksu*. i 

PS ILQS KAKKI ietmss SKLSNVEASKENVSQPKRAKRKLYTSE 

ISSPID1SGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


705 


i n 


" RPT?MAKFT>SODOINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 1 
AS PTPGEVQRHLQTHGI DGNGELDFSTFLT IMHMQIKQEDPKKE 
I LLAMLMVD KE KKG YVMAS DLRS KLTS LGEKLTHKEV \ DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY j 


j 5400 


931 


248 


SHCSSGME I P PTNYPASRAALVAQNY INYQQGTPHRVFEVQKVK | 
QASMED IPGRGHKYRLKFAVEEI I QKQVKVNCTAEVLYPSTGQE j 
TAPEVNFTFEGETGKNPDEEDNTFYQRLKSMKEPIjEAQNI\PDN 1 
FGNVS PEMTLVLHLAWVACG Y 1 1 WQNSTEDTWYKMVKIQTVKQV 
QRNDD F I ELDYT I LLHN IASQE 1 1 PWQMQ VLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAFLAPRDFPFPPKLLIHPQAWRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD 
SKKLPGLGDPDIDWEESVCLNliIIiQKLDYMVTCAVCTRADGGDI 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
VFSDMTVGKGEMVCVEIiVASDKTOTPQGVI FQGS IR YEALKKVY 
DNRVSVAARMAQK\ MS FGFS KYSNMEF\ VR\ MKGPQGKGHAEMA 
VSRVSTGDTS PCGTEEDSSPAS PMHERVTSFSTPPTPERNNRPA 
FFSPSLKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHNAT 
NLRSRSLSGTGRSLVGSWLKIiNRADGNFLLYAHLTYVTLPLHRI 
LTD I LEVRQKP ILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKPVD
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5403 



3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL 
TVEASCHDGDETIETIEAAEALLKMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5404 


187 


1111 


LPVTLIFAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRIIYDYGT 
DNFEBS I FSQDYEDKYLDGKNI KEKETVI I PNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSLAENQLLKLPVLPPKLTLFNAKYl^KIKSRGIKANAFK 
KLNNLTFLYLDHNAIiESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWT QQPSLDSRPRLDYERE IQPTA 
ILSLDQIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHEII 
P INVNNNYEHRHTSHLGHAVLPSNARGP ILSRSTS TGSAASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQHKFICEQCGKCKCXSECTAPRTLPSCIiACNRQCLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCPCLESCPSRGQGKPS 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY 



5407 



659 



5408 



2745 



6128 



5409 



2745 



6128 



RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHI KDP FQKATLRR YKNCEHKNVHLKKDHKS VDE 
CKVHRGGYNGFNQCLPATQSKIFLFDKCVKAFHKFSNSNRHKIS 
HTEKKLFKCKECGKS FCMLSHItAQHKI IHTRVNFCKCBKCGKAF 
NCPS I ITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI IRTGEKFYKCKECAKAFNQSS 
NLTEHKKIHPGEKPYKCEECGKAFNWPSTLTKHKRIHTGEKPYT 
CEECGKAFNQFSNLTTHKRIHTA\EKFYKCTECGEAFSRS\SNL 
TKHKEIHTEKKPYKCEECGKAFKWSSKIiTEHKLTHTGEKPYKCE 
KCGKAFNCPSIITKHNRINTGEKPYTCEECGKVFNWSSRIiTTHK 
KNYTRYKLYKCEECGKAFNKSSILTTHKKIHIEKKFYKCEECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNXiTTHKKIHTGEKFYKCEECGKAFTQ 
SSNLTTHKKIHTGGKPYKCEECGKAFNQFSTIiTKHKIIHTEEKP 
YKCEECGKAFKWSSTLTKHKI IHTGEKPYKCEECG\KAFKLSST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 

NKIHTGEKL YKPEDVTVI LTTPQTFSN I K 



RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGIiAVIjIL 
All IiLQGTLAQS IKGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVS I FDLAVGVYFIAGTGMEFR 
QS \RASDKQTLLP\NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 

QGS KGTCHPQAQQP WDEGVWQEAPSQS E P WGQSQE P PTMPQRIiP 

HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 

APRPVPASRGGKTLCKGYRQAPPGP PAQFQRP ICS AS PPWAS RF 

STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 

RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 

VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM 

NS S I QCVSNTQPLTQYF I SGRHLYELNRTNPIGMKGHMAKCYGD 

LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 

LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 

IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 

E ITV I KLDGTTPVRYGLRLNMDE KYTGLKKQLSDLCGLNS E Q I L 

LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP 

TQTDFS SS PSTNEMFTLTTNGDLPRP I F I PNGMPNTWPCGTEK 

NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR 

PSLFGMPLIVPCTVHTRKKDLYDAVW IQVSRLAS PLPPQEASNH 

AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 

DRAFIGNAYIAVDWHPTALHIiRYQTSQERWDEHESVEQSRRAQ 

VE PINLDS CLRAFTS EEELGENEMYYCS KCKTHCLATKKLDLWR 

LPPILIIHLKRFQFVNGRWIK5QKIVKFPRESFDPSAFLVPRDP 

ALCQHKPIiTPQGDEIjSEPR ILAREVKKVDAQSS AGEEDVLLS KS 

PSSLSANIISSPKGSPSSSRKSGTSCPSSKHSSPNSSPRTLGRS 

KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICEIADALSRGH 

VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG 

NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN 

PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK 

TDGKKMADTSSMDEDFESDY\EKYCVLQ 

QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRnAAKREQGSL 
APRPVPASRGGKTLCKGYRQAP PGPPAQFQRP I CSAS PPWASRF 
STPCPGGAVREDTYPVGTQGVPSliALAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANSSKlDRHKVPTEKGATGIiSNLGNTCFM 
NSS IQCVStJTQPLTQ YFI SGRHLYELNRTNP IGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
E I TVI KLDGTTP VRYGLRLNMPEKYTGLKKQIiS DLCGIiNSEQI L 
LAEVHGSN I KNFPQDNQKVRLS VSGFLCAFE I P VPVS PI SASS P 
TQTDFSS S PS TNEMFTLTTNGDLPRP I FI PNGMPNTWPCGTEK 
NFTNGMVNGHMPSLPDSPFTOYIIAVHRKMMRTELYFLSSQKNR 
PSliFGMPLIVPCTVHTRKKDLYDAVWIQVSRLAS PLPPQEASNH 
AQDCDDSMGYQYPFTLRWQECDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERWfeEHESVEQSRRAQ 
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
LPP ILI IHLKRFQFVNGRWI KSQKI VKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSE PRI LAREVKKVDAGSSAGEEDVLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCX5NGYSNGQLG 
NHS EEDSTDDQREDTR I KP I YNLYAI S CHSG I LGGGHYVTYAKN 
PNCKWYCYNDS S CKELHPDE I DTDSAY I LFYEQQG I DYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHLYKLLVIGDLGVGKTSI IKRY 
VHQNFSSHYRATIGVDFALKVLHWDPETVVRIjOIjWDIAGQERFG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KPVSVVLIiANKCDQGKDVLMNNGLKMDQFCKEHGFVGWFETSAK 
ENINI DEASRCLVTCHILANECDIMES I EPDWKPHLTSTKVASC 
SG\ CAK I LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGS FGKPS P VTGLRAARRRRTRPSAPAAPSVG C 
G KRRE SDAGAGGERASVRTGSGRRGGRTMAGDSEQTLONHOQPN 
GGEP FLIGVSGGTASGKS S VCAKIVQLLGQNEVDYRQKQWI LS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNEL ILKTLKEITEG 
KTVQI PVYDFVSHSRKEBTVTVYPADWLFEGIIiAFYSQER/ IR 
DLFQMKLFVDTDADTRLSRRVLKDISERGRDLEQILSSSTLRFV 
KPA\FEEFCIiPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGGPS\NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 



3180 


313 


QGISNFFHKEANFWFEVSGYLISPLRSPFVDPALEWSLMASPWN 
KMEGESSRFEIHTPVSDKKKKXCSIHKERPQKHSHEIFRDSSLV 
NEQSQ I TRRKKRKKDFQHL I S S PLKKS R I CDETANATS TLKKRK 
KRRYSALEVDEEAGVTVVLVDKENINNTPKHFRKDVDWCVDMS 
I EQKLPRK\ PKTDKFQVLAKSH\ AHKSEALHSKVREKKNKKHQR 
KAASWESQRA\RDTLPQSEFPTQBES WLS VGPGGEI TELP \ ASA 
HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
E VGADMQES \RPAVGLHGETAG I PAPAYKNKSKKKKKKSNHQEF 
EAVAMPESLESAYPEGSQVGSEVGWEG.STALKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VKSRPRQKKTQACIiAS KHVQEAPRLEPANEEHNVETAEDSE I RY 
LSADSGDADDSDADLGSAVKQLQEFIPNI KDRATSTI KRMYRDD 
LERFKEFKAQGVAIKFGKFSVKENKQLEKMVEDFLALTGIESAD 
KLL YTDR YPEEKS VI TNLKRRYS FRLHIG \RMIARPWKLI YYRA 
KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 
SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVILKKMSPQELK 
EVDSKLQENPESCLS IVREKLYKGISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I YYGMNALRAKVS L IERLYE INVEDTNEI 
DWEDLASAIGDVPPSYVQTKFSRLKAVYVPFWQKKTFPEIIDYL 
YETTLPIjLKEKLEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 
SEGGGHRKRKRRPRRHAWFTPVI PVLWEAKAGWII 


5413 


3753 


1304 


rfpagvaprramaiwskkvswsgrdrddeeaapllrrtarpggg 

TPLLNGAGPGAARQSPRSALFRVGHMSSVKIiDDELLEP\DMDPP 
HPFPKEI PHNEKLLSLKYESLDYDNSENQLFLEEERR INHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTEKGGLS FSLLLWATLNAAFVLVGSVI VAFI EPVAAGSGIPQ 
I KCFLNGVKI PHWRLKTLVI KVSGVI LS WGGLAVGKEGPMIH 
SGSVIAAGISQGRSTSLKRDFKIFEYLRRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTWRIFFASMISTFTLNF 
VLS I YHGNMWDLSS PGL INFGRFDS E KMAYT I HEI P VF I AMG W 
GG^GAVFNALNYWLTMFRIRYIHRPCLQVIEAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 
S WSLFHDPPGS YNPLTLGLFTLVYFFLACWTYGLTVSAGVF I P 
S LL I GAAWGRLFG I S LS YLTGAAI WADPGKYAIiMGAAAQIjG^.lv 
RMTLSLTVTMMEATSNVTYGFP IMLVLMTAK I VGDVFIEGLYDM 
HIQLQSVPFLHWEAPVTSHSLTAREVMSTPVTCLRRREKVGVIV 
DVLSDTASNHNGFPWEHADDTQPARLQGLI LRSQL IVLLKHKV 
FVERSNLGLVQRRLRLKDFRDAYPRFPPIQS IHVSQDERECTMD 
LS E FMNPS P YTVPQEAS L PRVFKLFRALGLRHLVWDNRNQ WG 
LVTRKDLARYRLGKRGLEEIiSIiAQT 


5414 


2130 


390 


GVASAWDRALFS PLLS PTSRVFRTSPPRCVSTETGRRDRARVPS 
QWCSVLQGKLPVSGRTSLACVRS ILLSPASSPRKVGIVGGTGAR 
AG7VAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRSP 
RSPRSRTRRGCSASPACLP/CRSALIVAVLCYINLLNYMDRFTV 
AGVIjPDIEQFFNIGDSSSGLIQTVFISSYMVLAPVFGYLGDRYN 
RKYLMCGGXAFWSLVTLGS SFI PGEHFWLLLLTRGLVGVGEASY 
STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 

_ .- , . — , m mm mm m *m mp mm. my mm, V **** V V V ^ T ^^^^ 1ft ^^^^ T /CO f ▼ ^5 ■ 

MAGDWHWAIiRVTPGLG WAVLLL FL WREP PRGAVEKHS DL P PIi 
NPTS WWADLRALARNPS FVLSSLGFTAVAFVTGSLALWAPAFLL 
RSRWLGETPP CLPGDSCS SSDSL I FGLITCLTGVLGVGLGVE I 
SRRLRHSNPRAD PkVCATGLLGSAP FLFLSLACARGS I VAT YI F 
IFIGETLLSMNWAIVADILLYWIPTRRSTAEAFQIVLSHLLGD 
AGS P YLIGLI SDRLRRNWP PSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKH\LTTLT\NQEQATIFEEVQKLRPRNEQRENEL 
IISFLRCLFEBKQKEHIHIGEMKQTSQMAAENIGSELPPSATRF 
RLDMLKNKAKRS LTESLES IIiSRGNKARGLQEHS I SVDLDSShS 
STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLSPQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 
RYHSVSTETPHERKDFES KANHLGDSGGTPVKTRRHSWRQQ IFL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTSRELRELWQKAILQQILLLRMEKENQKLQASENDLLNKR 
LKLDYEEIXPCLKEVTTVWEKMLSTPGRSKIKFDMEKMHSAVGQ 
GVP\RHHRGEIWKFliAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SQQHAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQE 
VGYCQGLSFVAGILLLHMSEEEAFKMLKFIjMFDMGLRKQYRPDM 
1 1 IiQ I QMYQLSRLLHD YHRDLYNHLE EHE IG P SLYAAP WFLTMF 
ASQFPIiGFVARVFDMIFLQGTEVTFKVALSLLGSHKPLILQHEN 
LETIVDFI KSTL PNLGLVQMEKTINQVFEMDIAKQLQAYEVE YH 
VLQEELIDSSPLSDNQRMDKLEKTNSSLRKQNLDLLEQLQVANG 
RIQSLEATIEKLLSSESKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


" KSQIiFCFWGGKAGDI LSGDQDKEQKDP YFVETP YGYQLDLDFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF
RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC
RSVAVGAEENMNDIVVYHRGSRSCKDAAVGTLVEMRNCGVSVTE
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH
MDLVDTCVGTSVETNSVGISCQPECKNKVVGPELPMNWWIVKER
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT
NLNLKEVRSIGCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ
TSTQTVETRTVAVGEGRVKDIKSSTKTRSIGVGTLLSGHSGFDR
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE
QEVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN
ESTLKSIMKKKDGNKDSNGAKRNLQFVGINGGYETTSSDDSSSD
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD
HQNKAGYTPIMLAALAAVEAEKDMRIVEELFGCGDVNAKASQAG
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD


5417 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF
RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC
RSVAVGAEENMNDIVVYHRGSRSCKDAAVGTLVEMRNCGVSVTE
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH
MDLVDTCVGTSVETNSVGISCQPECKNKVVGPELPMNWWIVKER
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT
NLNLKEVRSIGCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE
QEVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD
HQNKAGYTPIMLAALAAVEAEKDMRIVEELFGCGDVNAKASQAG
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD


5418 


24 


1133 


svpraggdmbtgaaelydqallgilqhvgnvqdflrvlfgflyr 
ktdfyrllrhpsdrmgfppgaaqaiivlqvfktfdhmarqddekr 

ROELEEK T RR KE EEEAKTVSAAAAEKE P VP VP VOE I E I D STTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EPPI 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVS VALSSS S I RVAMLEENGERVLMEGKLTHKINTES SLWSL 
EPGKCVLVNLSKVGEYWWNAILEGEEPIDIDK1NKERSMATVDE 
EEQAVIjDRLTFDYHQKLQGKPQSHE LKVHEMLKKGWDAEGS pfr 
GQRFDPAMFNIS PGAVQF 


5419 


1395 


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
FmPACT>pnc*/r.r5P /nr^7T3par\DQQwvr , QnrvT^MTTTiaiVNn?TV'B , TT. 

PQRIQQWQQS PC I AEEHGKKLLER I RREQQSARTRLQEMERRFH 
ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VALRKMERCYAKYESQTSFGSMYPTRIEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHS RDPKVPADE VCG CPLVRDVFELTGDFCRLP K 
RQCNRHYCWEKLRRAEVDLERVRVWYKLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
ECIISTLLFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/
LLELCTFTLAIALGAVLLLPFSIISNEVLLSLPRNYYIQWLNGS
LIHGLWNLVFLFSNLSLIFLMPFAYFFTESEGFAGSRKGVLGRV
YETVVMLMLLTLLVLGMVWVASAIVEKNKANRESLYDFWEYYLP
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEL
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVLALQTQRVL
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID
EAAMPRGMQGTSLGQVSFSKLGSFGAVIQVVLIFYLMVSSVVGF
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV
RAELIRAFGERE


5421 


117 


1733 


"NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR
ECIISTLLFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/
LLELCTFTLAIALGAVLLLPFSIISNEVLLSLPRNYYIQWLNGS
LIHGLWNLVFLFSNLSLIFLMPFAYFFTESEGFAGSRKGVLGRV
YETVVMLMLLTLLVLGMVWVASAIVEKNKANRESLYDFWEYYLP
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEL
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVLALQTQRVL
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID
EAAMPRGMQGTSLGQVSFSKLGSFGAVIQVVLIFYLMVSSVVGF
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV
RAELIRAFGERE


5422 


3 


1263 


SCGESLPTWLAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF 
KAGCNWWHLSRDQAGVQRCDLGSSQPPPLGFKRFSCLSLPSSWD 
YRSTVLCVSKMEADLSGFNIDAPRTOQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWPPGTQVEQLLYAKJCbYDSAF 1 
HPDTGEKMNVIGRMSFQLPGGMIITGFMLQFYRTMPAVIFWQWV 

I NQSFNALVNYTNRNAASPTSVRQMALSYFTATTTAVATAVGMNM 
LTKKAPPLVGRWVPFAAVAAANCVNI PMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVTMERLEKLH 
FMQKVKVL / SAPLQVMLSGCFLI FMVP VACGLFPQKCELP VS YL 

EPKLQDTIKAKYGELEPYVYFNKGL


5423 


3186


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ
PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD
ANREPVAERSEPALSGLPPATMGSGDLLSGESQVEKTKLSSSE
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL
GRRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR
SNLTSLK\SSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP
DKKPMAAMEHPCEGV


5424 


3186


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ
PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD
ANREPVAERSEPALSGLPPATMGSGDLLSGESQVEKTKLSSSE
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL
GRRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG
PASPPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR
SNLTSLK\SSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP
DKKPMAAMEHPCEGV


5425 


1086 


115 


GFCPSPSLGHQPPRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NSYWRVSTVHGNVITTNTIFENLWFSCATDSIiGVYNCWEFPSML 
ALSGYIQACRAIjMITAILLGFLGLIjUSIAGLRCTNIGGLELSRK 

aklaatagaphX ilpgicgmvai \s wyafnitr\dfsdplypgt 
kyelgpalylgwsaslisilgglclcsacccgsdedpaasarrp 
yqapvsvmpvatsdqegdssfgkygrnalrvaalcrgprclpta 
pkkrgpgrgpfpysnlrgrprpvpvapprprprvlhshgpsqak 
ncswevaylpseagsli F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK
LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV
LQASVLDDWFPLQGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG
VSSRPDPPSAAILVVYLDRAQDLPMVTSELYPPQLKKGNKEPNP
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM
KLVWRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY
VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLSIYMERAED
LPLRKGTKHLSPYATLTVGDSSHKTKTISQTSAPVWDESASFLI
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK
LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV
LQASVLDDWFPLQGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG
VSSRPDPPSAAILVVYLDRAQDLPMVTSELYPPQLKKGNKEPNP
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY
VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLSIYMERAED
LPLRKGTKHLSPYATLTVGDSSHKTKTISQTSAPVWDESASFLI
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS


5428 


3 


1839 


SSRSERLSACAIAPPWLVSSRPARPAQLQRPGKMVEDGAEELED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV
LAASI PYFHAMFTNDMMECKQDB IVMQGMDPSALEALINFAYNG 
NLAIDQQNVQSLLMGAS FLQLQS IKDACCTFLRERLHPKNCLGV 
RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELNVKSEEQVFEAALAWVRYDREQRGTFL\RNLQSNIRLL 
FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 
AFRTRPRCCTS I AGLI YAVGGLNS AGDSLNVVEVFDP IANCWER 
CRPMTTARSRVGVAWNGLLYAIGGYDGQLRLSTVQAYNTETDT 
WTRVGSMNS KRS AMGT WLDGQ I YVCGGYDGNS S LSSVET YS PE 
TDKWTVVTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YNHHTATWHPAAGMLNKRCRHGAAS LGSKMFVCGGYDGSGFLS I 
AEMYS S V\ADQWCLI VPM\ HTRR \SRVSLGGPAVGRLYAVWGVT 
TGQSNL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT 

I 


5429 


828 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISLSDGEEPPPYQGPCTLQ 
LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARLGGPCPPSSNS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL 


5430


441 


1507 


QKRRKRRRKKIMKTIQPKMHNSISWAIFTGLAALCLFQGVPVRS 
GDATFPKAMDNVTVRQGESATIjRCTIDNRVTRVAWLNRSTILYA 
GNDKWCLDPRWLLSNTQTQYSIEIQNVDVYDEGPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDIS1NEGNNISLTCIATGRPEP 
TVTWRHISPKAVGFVSEDEYLEIQGITREQSGDYECSASNDV\A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI /EGKKGVKVENRPFLSKLIFFNVSEHDYGNYT 
CVASNKLGHTNAS IMLFGPGAVSEVSNGTSRRAGCVWLLPLLVL 
HLLLKF 


5431 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5432 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV


5433 


360 


1885 


SVQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
LFGWPSLVFVFKNEDYFKDLCGPDAGPIGNATGQADCKAQDERF 
SLIFTLGS FMNNFMTFPTGYI FDRFKTTVARLIAI FFYTTATLI 
IAFTSAGSAVIJjFLAMPMLTIGGILFIjITNLQIGNLFGQHRSTI 
j I TLYNGAFDS SSAVFLI I KLL YE KGISLR/ VLLHLHLCLQ YLAC 
S THFPPDAPGAHPI PTAPQLQLWPVPWEWHHKGREG /QQLSMKT 
GSYSQRSSFQRRKRPQGQGRSRNSAPSGATL/CSRRFAWHLVWL 
SVIQLWHYLF 1GTLNSLLTNMAGGDMARVS TYTNAFAFTQFGVL 
CAPWNGLLMDRLKQKYQKEARKTGSST^VALCSTVPSLALTSIi 
LCLGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTLAFP 
I SEHFGKIiFGLVMALSAWSLI^FPIFTLIKGSi^NDPFYVNVMF 
J MLAILLTFFHPFLVYRECRTWKES PSAXA 


5434 


66 


652 


RYAAIiI I S L I QHKLLWRNQHCS RCV IMSPAQSAGLNWLF /GSGK 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDETAITCPQCRTGHLVQ 
RRSRYGKTFHS CDRYPECQFAINFKPI AGBCPECHYPLLI EKKT 
AQGVKHFCASKQCGKPVSAE


5435 


4704 


1597 


PGDSSQRLAEMSNAKERKHAKKMRNQPIOTTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQE I PKYI TASTFAQARAAEISAMLKAV 
TQKSSNSLVFQTLPRHMRRRAMSHNVKRLPRRLQEIAQKEAEKA 
VHQKKEHSKNKCmARRCTMNRTLEFNRRQKKNI WLETHIWHAK 
RFHMVKKWGYCjVSERPTVKSHRACYRAMTNRCLLQDLSYYCCLE 
LKGKEEE ILKALSGMCNI DTGLTFAAVHCLSGKRQGSLVL YRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKP I KKI IGDGTRDPCLP YSW I S PTTGI I ISDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
1 DSVS^CRQEAIFELLGGITSPAEIPAGTILGLTVGDPRINLPQ 
j KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHSFI WNQDI CKSV 
TENKI SDQDLNRMRSELLVPGSQLI LGPHESKI P I LLIQQ PGKV 
TGEDRLGWGSGWDVLLPKGWGMAFWI PFI YRGVRVGGIiKBSAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 
RRSEVPCAPMPKKTHQPSDEVGTS I EHPREAEBVMDAGCQESAG 
PERITDQElASEiniVAATGSHLCVLRSRKLIjKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLS ILGHFPRALVWVSLSLLSKGSPE 
PHTMI CVPAKEDFLQLHEDWHYCGPQES KHSDPFRSKILKQKEK 
KKREKRQKP \GRAS SDGPAGEEPVAGQEALTLGLWSGPLPRVTIj 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTOLLDMLSSQPAAQ 
RGLVLLRPPASLQYRFARIAIEV 


5436 


1781 


635


ASDS I PWSEARTTRKLAQRGCQWSLPERMPLWFCX5LPYSGKSR 
RAEELRVALAAEGRAVYWDDAAVLGAED PAVYGDSAREKALRG 
ALRASVERRLSRHDVVILDSLNYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGROTSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWNGSAQADVPKELEREESGAAESPALVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQtiDQVTS 
QVXiAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAEIjSRLRR 
QFISYTKMHPNNENIjPQLANMFLQYLSQSLH 


5437 


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 

PRRVDSSSENSGSDWDSAPETMEDVGHPKTi<DSGAIjRVSRAASE 

PSKEEPQVEQLGSKRMDSLKV7DQPISSTQESGRLEAGGASPKLR 

WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWLR 

GEPGAPSRYLGGPEECLQISTNLTLHLLELLASALLALCSRPLR 

AALDTLGLRGPLGLWLHGLLS FLAALHGLHAVLSLLTAHPLHFA 

CLFGLLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
if fit. 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
GMARWIKELFCHNERVVLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL


5439 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
GMARWIKELFCHNERVVLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL


5440 


693 


253 


EPIPVTPDHRLVTMTHIV\QTFSPVNS\GQPPNYEMLKEEQEVA 
MLGAPHNPAPPMSTVIHIRSETSVPDHVVWSLFNTIjFMNTCCLG 
FIAFAYS VKSRDRKMVGDVTGAQAYASTAKCLNI WAL 1 1/3 1 FMT 
ILLIIIPVLVVQAQR 


5441 


2 


2054 


CRDGGKNGFMVSPMKPLE I KTQCSGPRMDPKICPADPAFFS FIN 
NSDLWANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKE 
LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLli 
PPALFIPSTENEEQ\RLASARAVPRNVQPYWYEEVTNWINVH 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YEAAG E I VRLTTPGFS HS CSMS QNFDMFVS HYSSVS TP P CVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQ PGKKHPTVLFVYGGPQVQLVNNS FKG IKYIiRLNTLASLGY 
AWVIDGRGSCQRGLRFEGALKNQMGQVEI EDQVEGLQFVAEKY 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTGYTERYMDVPENNQHG YEAG S VALHVEKL PNE PNRLLILH 
GFLDENVHFFHTNFLVSQLIRAGKPYQLQVALPPVSPQIYPNER 
HS IRCPESGEHYEVTLLHFLQEYL 


5442 


1 


3474 


CGQRSRRRS PDMPEAKPAAKKAP KGKDAPKGAPKEAP PKEAPAE 

APKEAPPEDQSPTAEEPTGVFLKKPDSVSVETGKDAWVAKVNG 

KELPDKPTIKWFKGKWLELGSKSGARFSFKESHNSASN^TVEI. 

HIGKWLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 

ESFKRTSEKKSDTAGELDFSGLLKKREWEEEKKKKKKDDDDLG 

IPPEIWELLKGAKKSEYEKIAFQYGITDLRGMLKRLKKAKVEVK 

KS AAFTKKLDPAYQVDRGNK I KLMVE ISDPDLTL KWFKNGQE I K 

P S S KYVFENVGKKR I LT INKCTLADDAAYEVAVKDEKCFTELFV 

KEPPVLIVTPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEL 

TREDSFKARYRFKKDGKRHILIFSDWQEDRGRYQVITNGGQCE 

AELIVEEKQLEVLQDIADLTVKASEQAVFKCEVSDEKVTGKWYK 

NGVEVRPS KRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 

GSLSAKLNFLEIKVEYVPKQ\EPPKI PLGFASGGKTSENAD/ IV 

WAGNKLRLDV\ S I TGE APS PFAT \ W LKG\ DEVFTTTEGRTRI E 

KRVDCSS FVIESAQREDEGRYTI KVTNPIGBDVAS I FLQ WDVP 

DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 

RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 

TKPFMPIAPTSEPLHLIVEDVTDTTTTLKWRPPNRIGAGGIDGY 

LVEYCLEGSEEWVPANTEPVERCGFTVKNLPTGARIIiFRWGVN 

IAGRSEPATLAQPVTIREIAEPPKIRLPRHLRQTYIRKVGEQLN 

LWPFQGKPRPQWWTKGGAPLDTSRVHVRTSDFDTVFFVRQAA 

RSDSGE YELS VQ I ENMKDTATIR IRWEKAGPP INVMVKEVWGT 

NALVE WQAPKDDGNSE IMG YFVQKADKKTMEWFNVYERNRHTS C 

TVS DIj I VGNE YY FRVYTEN I CGLS DS PGVS KNTAR I LKTG ITFK 

PFE YKEHDFRMAPKFLTPL IDRVWAGYSAALNCAVRGHPKP KV 

VWMKNKME I REDPKFL I TNYQG VL TLN I RRPS P FDAGTYTCRAV 

NELGEALAECKLEVRVPQ 


5443 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPRRSRSAAEPA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFSVTTVDLKRKPADLQNLAPGTHPPFITFNSEVKTDV 
NKIEEFLBEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAYIKNS 
RPEANEALERGLLKTLQKIiDEYIiNSPLPDEIDENSMEDIKFSTR 
KFLDGNEMTLADCNLLPKLHIVKWAKKYRNFDIPKEMTGIWRY 
LTNAYSRDE FTNTCPSDKEVE I \ AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDILRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRMPQEDERPADEYDQPWEWKKDHISRAFAVQFDSPEWERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESIiL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPSVPEfiVLHYSSRPLPVQGAEHLALLYPWTQTP*Q 
* PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILSRGFLGSVEICIQIiPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVCPWWX*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEE SE P PAPN I RNMAPNSLS APTMIiHNSSGDFSQAHSTLKLANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASES WGALPAEHQFS FMEKRNQWLVSQLSAAS PDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDIiPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLP PNIjSPHAP WNYHYHCPGS PDHQVPYGHD YPRAAYQQVTQP 
ALPGQPLPGASVRGMPVQKVILNYPSPWDQEERPAQRDCSFFG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDIFEDRIRGlDIIKWMERYIiRDKTVMriVAISPKYKQ 
DVEGAESQLDEDEHGLHTKYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGIiLWMLFVSEIjRAATKLTEEKYELKEGQ 
TLDVKCDYTLEKFASSQKAWQIIRDGEMPKTLACTERPSKNSHP 
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT 
TPRTVTQAPPKS TADVSTPDS EINLTNVTDIIRVP VFN1 VI LLA 
GG FLS KS LVFS VLFAVTLRS FVP * AHE PTRMS SDFQPHPSGS CA 
KGGGRR 


5447 


207 


617 


MTARTLSLMASLVAYDDSDSEAETEHAGSFNATGQQKDTSGVAR 
PPGQDFASGTLDVPKAGAQPTKHGSCEDPGGYRLPLAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCWPY 
TPRRLRQRQALSTETOKGKDVEPQGPPAGRAPAPLYVGPGVSEF 
IQPYLNSHYKETTVPRKVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMDKTFKYWNAVDSGHCLQTYSLHTEAVRAARWAPCGRRIL 
SGGFDFALHLTDLETGTQLFSGRSDFRITTLKFHPKDHNIFIiCG 
GFSSEMKAWDIRTGFCVMRSYKATIQQTLDILFLREGSEFLSSTD 
ASTRD S ADRT I IAWDFRTSAKISNQI FHERFTCPSLALHPREPV 
FLAQTNGNYLALFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTGSADGRVLMYS FRTASRACTLQGHTQACVGTTYHPVLP 
SVLATCSWGGDMKIWH*AFHWLSLGEAIGDLAPARGYSGPGRSL 
KSPSPSKSLLVLLCGRAMFQPATCPWQIjPALSK 


5448 


194 


1833 


MAS KVTDAI VW YQKKIGAYDQQI WEKS VEQRE I KGLRNKPKKTA 
HVKPDIilDVDIiVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF 
RWWLQVTSKVI FFWLLVLYLLQVAAI VLFCSTSSPHS IPLTEVI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GS KKAKNS IDKSTETDNG YVS LDGKKTVKSGEDG IQNHEPQCET 
IRPEETAWNTGTLRNGPS KDTQRT I TNVSDE VS S BEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTS CSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRIiSQA 
TDLEQLTAHS ASELYVI AFGSNEDVI YLSMVI ISrv VKV o Jj v w j. 
FF FLLCVAERT YKQVG I M * TS EGVLRKRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQ INP CVKKE YRDDPFHQSHLPWLHS SHPGLEKI SAI 
VWEGNDCKKADMSVLEISGMIMNRWSHIPGIGYQIFGNAVSLI 
LGLTP FVFRLSQATDLEQLTAHSASELYVIAFGSNEDVI VLSMV 
IIS FWRVSLVWI FFFIiLCVAERTYKQVGIM 


5449 


194 


1833 


' MAS KVTDAI VW YQ KKIGAYDQQ I WEKS VEQRE I KGLRN KP KKTA 
HVKPDLIDVDLVRGSAFAKAKPES PWTSLTTKG I VRWFFPFFF 
RWWLQVTSKVI FFWLLVLYLLQVAAIVLFCSTSSPHS IPLTEVI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKIjRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNS IDKS TE TDNG YVS LDGKKTVKSGEDG IQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDIiLHCAEOISSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELWlAFGSNEDVIvijSMVIi.br vvKvsiiVWi 
FFFLLCVAERTYKQVGIM * TSEGVLRl^KSHHYKKHYPNEDAPK 
SGTSCSSRCSS SRQDSES ARPESETEDVLWEDLLHCAECHSSCT 
S ETD VENHQ INP CVKKE YRDD P FHQSHLP WLHS S HPGLEKI SAI 
VWEGNDCKKADMSVTjEISGMIMNRVNSHIPGIGYQIFGNAVSLI 
LGLTP FVFRLSQATDLEQLTAHSAS EL YVIAFGSNEDVIVLSMV 
I ISFWRVSLVWIFFFLLCVAERTYKQVGIM 


5450 


B136 


1242 


GQQFAS FFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLLAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDSIPHTVV 
LTWEG WATLS VDG FLNAS SAVPGAPLE VPYGLFVGGTGTLGLP 
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 
RRGDF I Y VD I FEGHLRAWE KGQGTVLLHNSVPVADGQPHEVS V 
HINAHRLE I S TOQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLGLTPEATNASLLGCMEDLSWGQRRGLREALLTRNMA 
AGCRLEEEEYEDDAYGHYEAFSTLAPEAWPAMELPEPCVPEPGL 
PPVFANFTQLLTIS PLWAEGGTAWLEWRHVQPTLDLMEAELRK 
SQVLFSVTRGAHYGEIjELDILGAQARKMFTLLDVWRBCARFIHD 
GSEDTSDQLV^EVSVTARVPMPSO^RGQTYIjLPIQvNPvOTPP 
H 1 1 FPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VSDGLQASPPATLKWAIRPAIQIHRSTGLRLAQGSAMPILPAN 
LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEWWATQ 
AFHQRDVEQGRWYLSTDPQHHAYDTVENLAIjEVQVGQEILSNL 

sfpvtiqrattomlrleplhtqntqqetlttahleatleeagps 
pptfhyewqaprkgnlqlqgtrlsdgqgftqddiqagrvtyga 
taraseavedtfrfrvtapp yfs plytfp ihiggdpdapvltnv 
llvtoeggegvlsadhlfwslnsasylyevmerprlgrlawrg 
tqdkttmvtsftnedllrgrlvyqhddsetteddipfvatrqge 
ssgdmaweevrgvfrvaiqpvndhapvqtisrifhvarggrrll 
ttddvafs dadsgfadaqlvltrkdllfgs i vavdeptrpi yrf 

Tnwm P KT? R VT.FVHSGADRGW IOLOVSDGOHQATALLEVQASEP 
YLRVANGSSLWPQGGQGTIDTAVLHLDTNLDIRSGDEVHYHVT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLSPEDTMAF 
S VEAGP VHTDATLQ VT I ALEG P LAP LKL VRHKKI YVFQG EAAE I 
RRDQLEAAQEAVP PAD I VFS VKS P PSAGYLVMVSRGALADE P PS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 
GVLVELEVL P AAI PLE AQNFS VPEGGSLTLAPP LLRVSG P YF PT 
LLGIiSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRYV 
HDGSETLTDSPVLMANASEMDRQSHPVAFTVTVLPVNDQPPILT 
TNTGLQMWEGATAP I PAEALRSTDGDSGSBDLVYTI EQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TS PGHFFRVTAQKQVLIiS LKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLLLYRWRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GNILYEHEMPPEPFWEAHDTLELQLS S PPARDVAATLAVAVSFE 
AACPQRPSHL WKNKGLWVPEGQRARITVAALDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQL.LVSEEPLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDGFHFRAHLQGPAGASVAGPQTSEAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 
RAPHNGFLSLVGGGLGPVTRFTQADVDSGRLAFVANGSSVAGIF 
QLSMSDGASPPLPMSLAVDILPSAIEVQLRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRLICX3PQYGHLLVGGRPTSAFSQFQI 
DQGE WFAFTNFS SSHDHFRVLALARGVNASAVVNVTVRALLHV 
WAVjOPWFyviA 1 JjKbUJr' I V JjUASja JjAJNRTvjSVPRFRJjJjEGPRHGR 
VVRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDSLTLELWAQGVPPAVASLDFATEPYNAARPYSVALLSVPEA 
ARTEAGKPESSTPTGEPGPMASSPEPAVAKGGFIiSFLEAWMFSV 
1 1 PMCLVLLLLAL I LPLL FYLiRKRNKTG KHDVQVLTAKPRNGLA 
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451 


T 


1 

: 


2274 



RDSSEQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNSEPGSPHSLEALRDAAPSQGLNFLLLFTKMLFIFNFLFSPLP 
TPAL I CI LTFGAAI FLWI>I TRPQPVLPLLDLNNQSVGIEGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGIiAVSDNGPCLGYRKPNQ 
P YRWLS YKQVSDRAE YLGSCLLHKGYKS SPDQFVG I FAQNRPE W 
IISELACYTYSMVAVPLYDTl^PEAIVHITOKADIAMVICDTPQ 
KALVLIGNVEKGFTPSLKVI I LMDPFDDDLKQRGE KSGI E ILS L 
YDAENLGKEHFRKPVPPS PEDLS VICFTSGTOGDPKGAMITHQN 
IVSNAAAFLKCVEHAYEPTPDDVAISYLPLAHMFERIVQAWYS 
CGARVGFFQGDIRLLADDMKTLKPTLFPAVPRLLNRIYDKVQNE 
AKTPI/KKFLLKLAVSSKFKELQKGIIRHDSFWDKLIFAKIQDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT
FTLPGDWTSGHVGVPLACNYVKLEDVADMNYFTVNNEGEVCIKG
TOTFKGYLKDPEICTQEAIiDSDGWLHTGDIGRWLPNGTLKIIDRK 
KNIFKLAQGEYIAPEKIENIYNRSQPVLQIFVHGESLRSSLVGV 
VVPDTDVLPSFAAKLGVKGSFEELCQNQVVREAILEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGEIiSKYFRTQ 

lUoJu X £1X1 -L^U 


5452 


1833 


1138 


SRVPSLCLSLSIiSIiSPSREPVAGAPGCGrAGPPAMATIiWGGIiLR 
LGSLLSLSCLALSVLIiLAQLSDAAKNFEDVRCKCICPPYKENSG 
HI YNKNI SQKDCDCLHWE PMP VRGPDVEAYCLRCECKYEERSS 
VTIKVTIIIYLSIU3LtiLYMVYLTLVEPILKRRLFGHAQLIQS 
DDDIGDHQPFANAHDVLARSRSRANVI^KVEYAQQRWKLQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PS I PAAVPQSAPPEPHREETVTATATS QVAQQPPAAAAPGBQAV 

AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 

PQEERSQQQDDIEELETKAVGMSNEX3RFLKFDIEIGRGSFKTVY 

KGl^TETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 

VKriDSWEb aVKGKJMJIVLVTEIjMTSGTIjK^ 

RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGD 

LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 

LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII

EGCIRONKDERYSIKDIjr.NHAFFORFTGVRVEriAEEDDGEKIAI 

KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG

YVCEGDHKTMAKAIKDRVSLIKRKREQRQL* 


5454 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEEIiETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGL1DTETTVEVAWCEI1QDRKLTKSERQRFKEEAEMLKGLQHPNI 








VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKvM 

RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGD 

IX3LATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 

LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 

BGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEEDDGEKIAI 

KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG

YVCEGDHKTMAKAIKDRVSLIKRKREQRQL*


5455 


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAILPLLFGCIiGVFGLFRLLQ 
WVRGKAYLRNAVWITGATSGLGKECAKVFYAAGAKLVLCGRNG
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
■QCFGYVDILVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT 
KALLPSMIKRRQGHIVAISSIQGKMSIPFRSAYAASKHATQAFF
DCLRAEMEQYEIEVTVISPGYIHTNLSVNAITADGSRYGVMDTT
TAQGRSPVEVAQDVLAAVGKKKKDVILADLLPSIiAVYLRTLAPG
LFFSLMASRARKERKSKNS


5456 


2 



2332 


CGAGLVAAGAVLVLYPASRAGERTRVPGS PAPS SLPLHS PGACG 
TEVDMDPQRSPLLEVKGNIELKRPLIKAPSQLPLSGSRIjKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
QKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGG 
KKPSKRPAWDIiKGQLCDLNAELKRCRERTQTLDQENQQLQDQLR 
DAQQQVKALGTERTTLEGHIJVKVQAQABQGQQELKNLRACVLEL 
EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRLHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTJjSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGLIPR
ALRHLFSVAQELSGQGWTYSFVASYVEIYNETVRDLLATGTRKG
QGGECEIRRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNR
AVARTAQNERSSRSHSVFQLQISGEHSSRGLQCGAPLSLVDLAG 
SERLDPGLALGPGERERLRETQAINSSIiSTLGLVIMALSNKESH 
VPYRNSKLTYLLQNSLGGSAKMLMFVNISPLEENVSESLNSLRF
ASKVEPSVLFGTAQSNRKWKTDPDLCVCVCVCVCVCVCVCVCVP
MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA
LLRSAERLMRKVKKLRLDKENTGSWRSFSLNSEGAERMATTGTP
TADRGDAAATDD PAARFQVQKHSWDGLRSI IHGSRKYSGL1 VNK 
APHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKKV 
RKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGI 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PIiEI KTQCSGPRMDPKICPAOPAFFS FINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQA
ASLCQSCPQECPAVCGVRGGHQRLDQCS


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWLRAEV 

KRIiSHELAETTREKIQAAEYGIiAVLEEKHQLKLQFEELEVDYEA 

! IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 

VRKVLELQTELKQLRNVLTNTQS ENERI*AS VAQELKE INQNVE I 

QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 

VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISERQLEEALET 

LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKFSDDAA 

EPNNDAEALVNGFEHGGLAKLPLDNKTSTPKKEGLAPPSPSLVS 

DLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 

e>r cc^rxirinT^pnT tdmi pt\t dot nACVPROTIU\nNEK"DRDSHEDG 
SliSEQQCiix V IKLii bWijoAijKKijWA&lVii-K-W L*\uiJv*ttxujr^LJiDaiui-Ju 

DYYEVD INGPE ILACKYHVAVAEAGELREQIiKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRQDRELLARLEKELKKVS 
DVAGETQGSLSVAQDELVTFSEEIjANLYHHVCMCNNETPNRVML 
D YYREGQGGAGRTS PGGRTS PEARGRRS P I XjLPKGIiLAPEAGRA 

dggtgdsspspgsslpsplsdprrepmniynliaiirdqikhlq 
aavdrttelsrqriasqelgpavdkdkealmeeilklksllstk 


REQITTLRTVLKANKQTAEVALANLKSKYENEKAMVTETMMKLR 
NELKALKEDAATFS SLRAMPATRCDEY I TQLDEMQRQLAAAEDE 

kktlnsllriviaj:qqklaltqrlelleldheqtrrgrakaapktk 
patpsvsktcacasdraegtglanqvfcsekhsiycd 


5459 


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRIjGIYQRCWLVFKKASS 

kgpkrlekfsderaayfrcyhkvtelnnvtcnvarlpkstkkhai 
giyfnddtsktfacesdleadewckvlqmecvgtrindislgep 

DLLATGVBREQSERFNVYLMPSPNIjGCYMGECALQITYEYICLW 

dvqnprvkliswplsalrrygrdttwftfeagrmcetgeglfif 
qtrdgeaiyqkvhsaaiaiaeqherllqsvknsmlqmkmseraa 
slstmvplprsaywqhitrqhstgqlyrlqdvssplklhrtetf 
payrseh 


5460 


45 



2097 

i 


rpgcragelstgsrarervrnrvsapcgqdsrrcdpevlrgrsp 
glglaempscgactcgaaavrlitsslasaqrgisggrihmsvl 
grlgtfetqilqraplrsftetpayfaskdgiskdgsgdgnkks 
asegsskksgsgnsgkggnqlrcpkcgdlcthvetfvsstrfvk 
cekchhffvvlseadskksiikepesaaeavklafqqkpppppk 
k i ynyldk ywgqs fakkvls va vynh ykr i ynni panlrqqae 

vekqtsltpreleirrredeyrftkllqiag1sphgnalgasmq 
qqvnqqipqekrggevldsshddi kleksni lllgptgsgktll 
aqtlakcldvp fai cdcttltqag yvgedi esviakllqdanyn 
vekaqqgivfldevdkigsvpgihqlrdvggegvqqgllklleg 
tivnvpeknsrklrgetvqvdttnilfvasgafngldriisrrk 
nekylgfgtpsnlgkgrraaaaadlanrsgesnthqdieekdrl 
lrhveardli e fgmipe fvgrlpvwplhsldektiivqiltepr 
navi pq yqalfsmdkcelnvtedalkaiarlialerktgarglrs 
imeklliiepmfevpnsdivcvevdkewegkkepgyiraptkes 
seeeydsgveeegwprqadaans 


5461 



1481 



160 


inpppppkspcgrarkwrrrrrpgapeaavmelpsgpgperiifd 
shrlpgdcflllvlllyapvgfcllvlrlflg ihvflvs calpd 
s vlrrfwrtmcavlglvarqedsglrdhs vrvli snhvtp fdh 
nivnllttcstpllnsppsfvcwsrgfmemngrgelveslkrfc 
astrlpptplllfpeeeatngregllrfsswpfsiqdvvqpltl 
qvqrplvsvtvsdaswvsellwslfvpftvyqvrwlrpvhrqlg! 
eanee falr vqqlvake lgqtgtrltpadkaehmkrqrhprlrp 
qs&qss fpps pqpspdvqlatlaqrvkevlphvplgviqrdlak 
tgcvdltitnllegavafmpeditkgtqslptasaskfpssgpv 
tpqptaltfaks s warq es lqerkqal yeyarrrfterraqead 


5462 


6£3 


3353 


KIKERQMSANNSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA
RLSNGSFSAPSLTNSRGSVHTVSFLLQIGLTRESVTIEAQELSL
SAVKDLVCSIVYQKFPECGFFGMYDKII/LFRHDMNSENILQLIT 
SADE IHEGDLVEWLS ALATVEDFQ I RPHTLYVHS YKAPTFCDY 
CGEMLmLVRQGhKCEGCGLNYHKRCAFKIPmCSGVRKRRLSU 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLLKGLFRQGMQC
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
S PSTSKNI PLMRWQS I KHTKRKSS TM VXEGWMVHYTSRDNLRK 
RHYWRLDSKCLTLFQNESGS KYYKE I PLSEILR1 S S PRDFTNIS 
QGSNPHCFE I ITDTMVYFVGENNGDSSHNPVLAATGVGLDVAQS 
WEKAIRQALMPVTPQASVCTSPGQGKDHKDLSTSISVSNCQIQE 
NVDIS T VYQ I FADEVTjGSGQ FG I VYGGKHRKTGRDVA IKVIDKM 
RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 
LHGDMLEMXLSSEKSRLPERITKFMVTQII.VAI1RNLHFKNIVHC 
DLKPENVLIiAS AE P F PQVKLCDFGFAR I IGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVIIYVSLSGTFPFNEDEDINDQI 
QNA7VFM YPPNPWRE I SGEAIDL INNLLQVKMRKRYS VDKSLSHP 
WLQDYQTWLDLRE FETRIGERY ITHES DDARWE IHAYTHNLVYP 
KHFIMAPNPDDMEEDP 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPWKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTLATQS PFNDRPMCRICHEGS SQEDL 








LSPCECTGTLGTIHRSCLEKWLSSSNTSYCELCHFRFAVERKPR 
PLVEWLRNPGPQHEKRTLFGDMVCFLFITPLATISGWLCLRGAV 
DHLHFS SRLEAVGL I ALTVALFT I YLFWTbVSFRYHCRLYNEWR 
RTNQRVILLIPKSVNVPSNQPSLLGLHSVKRNSKETVV


5464 


195 


677 


SPSMNPRKKVDLKL I IVGAIGVGKTSLLHQYVHKTFYEEYQTTIj 
GASILSKIIILGDTTLKLQIWDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC


5465 


5278 


3348 


KGDPREFIRVHREALECDYVSAHLHEWIDLIFGYKQQGPAAVEA 
VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHIjDNLRPSLTPV 
KELKEPVGQIVCTDKGILAVEQNKVIjIPPTWNKTFAWGYADLSC 
RLGTYESDKAMTVYECLSEWGQIIiCAICPNPKLVITGGTSTWC 
VWEMGTS KEKAKTVTLKQALLGHTDTVTCATASIiAYHI IVSGSR 
DRTCIIWDLNKLSFLTQLRGHRAPVSALCINELTGDIVSCAGTY 
IHVWS INGNP I VSVNTFTGRSQQ 1 1 CCCMSEMNEVJDTQNVIVTG 
HS DG WRFWRME FLOVPETP AP E PAE VLEMOED CPE AO I GOEAO 
DEDSSDSEADEQS I SQDPKDTPSQPSSTSHRPRAAS CRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNPIEVRNYSRLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 
EVTALGISKDHSRILVGDSRGRVFSWSVSDQPGRSAADHWVKDE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
SSPVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRSVMGIQTSPVLLASLGVGLVTL
LGT JVVG S YLVRRS RRPOVTLLD PNE KYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW
FTLDHPPKDWAYSKGFVTADMIREHLPAPGDDVLVLLCGPPPMV
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQARIF IQKKDLEEDES VTAAHLKS RG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECIjQPDQRTL 
YRDVMLENYSHIjI SLAGSS I SKPDVITLLEQEKE PWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVl^PKQVIKQISTTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKMISYEKLPTHTPHASLICNT 
HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCKECGKAFQLH 
IQLTRHQKFHTGEKTFECKECGKAFNLPTQIjNRHKNIHTVKKLF 
ECKECGKSFNRSSNLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKECGMAFRYKYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHOKIHTGEKPFECRECGKAFSLLNQIiNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLIEHSRIHTG 
DKPFECQDCX3KAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468 


225 

t 


2976 


SFLTDLFQSLAQLENLCKQLYETTDTTTRLQAEKALVEFTNSPD 
CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTNNPLPLEQRIDI 
RNYVLNYLATRPKLATFVTQALIQLYARITKLGWFDCQKDDYVF 
RNAITDVTRFLQDSVEYCIIGVTILSQLTNEINQVSATAFLIEA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNliND 
ESQHGLLMQLLKLTHNCLNFDFIGTSTDESSDDLCTVQIPTSWR
SAFLDSSTLQLSTIGRCEYEKTCALLVQLFDQSAQSYQELLQSA 
SASPMD I AVOEGRLTWLVY I IGAVIGGRVS FASTDEQDAMDGEL 
VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEQFRKIYIGDQVQ 
KSSKLYRRLSEVLGLNDETMVLSVFIGKIITNLKYWGRCEPITS 
KTLQLLNDLSIGYSSVRKLVKLSAVQFMLNNHTSEHFSFLGINN 
QSNLTDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE
AVAQMFSTNSFNEQEAKRTLVGLiVRDLRGIAFAFNAKTSFMMLF 
EWIYPSYMPILQRAIELWYHDPACTTPVLKLMAELVHNRSQRLQ








FDVSSPNGILLFRETSKMITMYGNRILTLGEVPKDQVYALKLKG 
ISICFSMLKAALSGSYVNFGVFRIiYGDDALDNALQTFIKLLLSI 
PHSDLLDYPKLSQSYYSLLEVLTQDHMNFIASLEPHVIMYILSS 
ISEGLTALPTMVCTGCCSCLDHIVTYIjFKQLSRSTKKRTTPLNQ 
ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPLLG 
LILIjNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLIjSEAHTC 
VPENNGGAGCVCHLIiMDDVVSADNYTIiDLWAGQQLLWKGS FKPS 
EHVXPRAPGNLTVHTOVSDTLLLTWSNPYPPDNYLYNHLTYAVN 
IWSENDPADFRIYNVTYLEPSLRIAASTLKSGISYRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL
CYVSITKIKKEWWDQIPNPARSRLVAIIIQDAQGSQWEKRSRGQ
EPAKCPHWKNCLTKLLPCFLEHNNKRDEDPHKAAKEMPFQGSGK 
SAWCPVEISKTVLWPESISWRCVEIjFEAPVECEEEEEVEEEKG 
SFCASPESSRDDFQEGREGIVARLTESLFLDLLGEENGGFCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS 
PPASPTQSPDNIiTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLLARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRRNV 
LQHGAAAAP VSAPTSGYQEFVHAVEQGGTQASAWGLG PPGEAG 
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEQATDPLVDSLGSGIVYSALTCHIiOGHLKQCHGQEDGG 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SIiAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 



17 


1418 

r 

1 


TACRI RTSLNRG I AAVKEDAVEMLAS YGLiAYS LMKFFTG PM S D F 
KNVGLVFVNS KRDRTKAVLCMWAGAI AAVFHTL IAYSDLGY YI 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
SFLVGCAS ISDVIAQWFVAILLHSHLECREPLIiIP ILSLYMGA 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLA 
LILATQRISRPIVNLFVSRDLGGSSAATEAVAILTATYPVGHMP 
YGWLTE I RAVYPAFDKNNPSNKIjVS TSNTVTAAH I KKFTF VCMA 
LSLTLCFVMFWTPNVSEKI L XDI IGVDFAFAELCWPLR I FS FF 
P W VTVRAI&TOWLMTLiaCT F\HjAPS S VLRI IVL I ASLWLP YL 
GVHGATLGVGSLLAGFVGESTMDAIAACYVYRKQKKKMENESAT
EGEDSAMTDMPPTEEVTDIVEMREENE


5471 


1868 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAI KKI S P FEHQTYCQRTLREIQ 1 LLRFRHENV I GI RD I LR 
ASTLBAMRDVYIVQDLMETDLYKLLKSQQLSNDHICYFIiYQILR 
GLKYIHSANTVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLS 
NRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINMKARNYLQSL 
PSKTKVAWAKLFPKSDSKALDLLDRMLTFNPNKRITVEEALAHP 
YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELI FQETARFQ 
PGVLEAP 


5472 


1469 


753 


LYVMARYLSDEEVAVSIDRLCKANGRSPSIPFGTVRIPGRARVR 
DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 
VLGGYDTKEVTFYPQDAPDQPLKAIiAYVATPQNPGYLGPAPEEA 
IATQILACRGFSGHNLEYLLRVRDVMQLCGPQAQDEHLAAIVDA 
VGTMLPCFCPTEQALALV 


5473 


3 


2119 


FMNVTOjLIQDLEDIEQRVPVMDAQYKIITKTAHLITKESPQEEG 
KEMFATMSKLKEQLTKVKECYSPLLYESQQLLIPLEELEKQMTS 
FYDSLGKINEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQSMVKK 
TGDWKKHVErNSRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 
LLRRHTE FFSQLDQRVIiNAFLKACDELTD I LPEQEQQGLQEAVR 
KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTELDRET 




KLMPQEGSEKriKEHRVFFSDKGPHHLCEKRLQLIEELCVKLPV 
RDPVRDTPGTCHVTLKELRAAIDSTYRKLMEDPDKWKDYTSRFS 
E FSSWI STNETQLKG I KGEAIDTANHGEVKRAVEEIRNGVTKRG 
ETLSWLKSRLKVXjTEVSSENEAQKQGDEL^^ 

evekmiisnfgdcvqykeivknsleelisgskevqeqaekildte 
nlfeaqqlllhhqqktkrisakkrdvqqqiaqaqqgegglpdrg 

HEEIjRKLESTLDGLERSRERQERRIQVTLRKWERFETNKETWR 

ylfqtgssherflsfssleslsseleqtkefskrtesiavqaen 
iivkeaseiplgpqnkqllqqqaksikeqvkkledtleeeyvidk 

s 


5474 


2 


760 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKSGWLLRQSTI 
LKRWKKNWFDL W SDGHL I Y YDDQTRQN I ED KVHM PMDCINI RTG 
QECRDTQPPIKSKSKDCMLQrvCRDGKTlSLCAESTDDCriAWKFT 
LQDSRTNTAYVGSAVMTLDJi lbVvbblri'Fi lAiAAFAAravvjKi Jjo 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVIIRERYTlDNDSDIALGMIiAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSLTCQITPPPSSPCIjLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNISLAVRKIALLLKPDKEIEHQGNHMTVRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYIiELTARDAVCEQVFRKVR 


5476


192 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRASEVLCSTNVSHYELQVEIGRGFDNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
SWLWVISPFMAYGSASQIiLRTTFPEGMSETLIRNILFGAVRGIjN 
YLHQNGCIHRSIKASHILISGDGLVTLSGLSHLHSIiVKHGQRHR 

avydfpqfstsvqpwlspellrqdlhgynvksdiysvgitacel 
asgqvpfqdmhrtqmllqklkgppyspldisifpqsesrmknsq 
sgvdsgigesvlvssgthtvnsdrlhtpssktfspaffslvqlc 
lqqdpekrpsassllshvffkqmkeesqdsilsllppaynkpsi 
slppvlpwtepecdfpdekdsywef 


5477 



3 



1044 


RGNSRIiRYSHEDELQLPRLPELFETGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLEKAAEMLSQIiDLFSRNEDLEE IAS TDjjKYJjLV 
PAFQGALTMKQVNPSKRLDHLQRAREHFINYLTQCHCYHVAEFE 
LP KTMNNSAENHTANSSMAYPSLVAMAS QRQAKI QRYKQKKELE 
HRLSAMKSAVESGQADDERVREIYYLLHLQRWIDISLEEIESIDQ 
EIKILRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMWSDWYEQHRKYGALPDQGIAKAAPBEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHF CSDGQSFVTASDDK. 1 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 
LWDKSSRECTOSYCEHGGFVT5TVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC 

LENQQIilMQRATP 


5479 


2 



835 


KTVRIWVPNVr^ESTVFRAHTATVRSVn V lAaUUxs.1 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRIilVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQHYQLHSAAVNGLS FHPSGNYLI TAS SDSTLKILDL 
MEGRLLYTLHGHCK3PATTVAFSRTGEYFASGGSDEQVMWKSNF 
DIGDHGEVTKVPRPPATbASSMGNLTVSILEQRIjTIiEEDKLKQC 
LENQQLIMQRATP 


5480 


444 


1952 


LSLTSRMEEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH 

LRLEKEIQDLEKAELQISTKEEAILKKLKSIERTTEDIIRSVKV
EREERAEESIEDIYANIPDLPKSYIPSRLRKEINEEKEDDEQNR
KALYAMEIKVEKDLKTGESTVLSSIPLPSDDFKGTGIKVYDDGQ
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKSPTEYH
EPVYANPFYRPTTPQRETVTPGPNFQERIKIKTNGLGIGVNESI
HNMGNGLSEERGNNFNHISPIPPVPHPRSVIQQAEEKLHTPQKR








LMTPWEESNVMQDKDAPSPKPRLSPRETIFGKSEHQNSSPTCQE
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHKS 


5481 


3 


1422 


NSPGSVCLCQCTCPSLLHCLPPLLLLLLLPLLLiHESPQPPALRV 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
QGLNEAGDDLEAVAKFLDSTGSRLDYRRYADTLFD ILVAGSMLA 
PGGTRIDDGDKTKMTNHCVFSANEDHETIRNYAQVFNKLIRRYK 
YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSGILLGNGTLPAT 
ILTSLFTDSLVKEGIAASFAVKLFKAWMAEKDANSVTSSLRKAN 
LDKRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
KELQKEIiQERLSQECP I KEVVLYVKEEMKRNDLPETAVIGLLWT 
C IMNAVE WNKKEELVAEQALKHLKQYAPLLAVFSSQGQS ELI LL 
QKVQEYCYDNIHFMKAFQKIWLFYKADVLSEEAILKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRL 
EGLQEKDSGPYSCSVHVQDKQGKSRGHSIKTLELNVLVPPAPPS 
CRLQGVPHVGANVTLSCQSPRSKPAVQYQWDRQLPS FQTFFAPA 
LDVIRGSLSLTNLSSSMAGVYVCKAHNEVGTAQCNVTLBVSTGP 
GAAWAGAWGTLVGLGLLAGLVLLYHRRGKALEEPAND IKEDA 
IAPRTLPWPKSSDTISKNGTLSSVTSARALRPPHGPPRPGALTP ' 
TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483 


1 


788 



FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA 
ENRIKQLBTDSSEEISRYQEMIQKLQNVLBSERENCGLVSEQRL 
KLQQENKQLRKETESLRKIALEAQKKAKVKISTMEHEFS I KERG 
FEVQLREMEDSNRNS I VELRHLLATQQKAANRWKEETKKLTESA 
EIRINNLKSELSRQKLHTQELLSQLEMANEKVAENEKLILEHQB 
KANRLQRRLS QAEERAASASQQLS VITVQRRKAASLMNLENI 


5484 


3 



1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 

ESDQDERGDSGQPSNKELFGDDSBDEGASHHSGSDNHSERSDNR 

SEASERSDHEDNDPSDVDQHSGSBAPNDDEDEGHRSDGGSHHSE 

AEGSEKAHSDDEKWGREDKSDQSDDEKrQNSDDEERAQGSDEDK 

LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 

SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSBSARGSDSBDEVLRMKRKNAIASDSEADSDTEVPKD 

NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 1 

P IPETRIEVEI PKVNTDLGNDLYFVKLPNFLSVEPRPFDPQYYE 

DEFEDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI KESNAR 

IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 

VFKTKLTFRPHSTDSATHRKMITjSLADRCSKTQKIRILPMAGRD 

PECQRTEMI KKE EERLRAS I RRESQQRRMREKQHQRGLSAS YLE 

PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 

EDKAQRLLKAKKLTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 

AGTN 


5485 


161 


1074 


KKKILSSM^SEAHEKRPPILTSSKQDISPHITlSrV'GEMKHYLCG 
CCAAFNNVAITFPIQKVLFRQQLYGIKTRDAILQLRRDGFRNLY
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAPEFATSGVAA
VLAGTTEAIFTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI
GEYYRGLVPILFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN
DFICGGLLGAMLGFLFFPINWKTRIQSQIGGEFQSFPKVFQKI
WLERDRKLINLFRGAHLNYHRSLISWGIINATYEFLLKVI


5486 


1404 


142 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERSPR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSLATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASSV
IFGLKGYVAERKGEREEMQDAHVILNDITEECRPPSSLITRVSY
FAVFDGHGGIRASKFAAQNLHQNLIRKFPKGDVISVEKTVKRCL
LDTFKHTDEBFLKQASSQKPAWKDGSTATCVLAVDNILYIANLG
DSRAILCRYNEESQKHAALSLSKEHNPTQYEERMRIQKAGGNVR
DGRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFILL








ACDGIiFKVFTPEEAVWFILSCLEDEKIQTREGKSAADARYEAAC 
NRLANKAVQRGSADNVTVMVVRIGH


5487 


535 


182 


AVSLEQIRGLQTPAPVPLPLQPCPSNCDMERVTliALIiLLAGLTA 
LEANDPFANiaJDPFYYBWIQJLQLSGLICGGLLAIAGIAAVLSGK 
CKCKSSQKQHSPVPEKAI PLITPGSATTC 


5488 






nMAAQrtFPfYR fiWOFFVAAVAAR/n^PMTDTjVSLTSRLPKTGETIH 
GHKFF IGFGGKGANQCV©AAiU<GAMTSMVCKVGKDSFGNDYIEN 
LKQNDISTEFTYQTKDAATGTASIIVNNEGQNIIVIVAGANLLL
NTEDLRAAANVISRAKVMVCQLEITPATSLEALTMARRSGVKTL
FNPAPAIADLDPQFYTLSDVFCCNESEAEILTGLTVGSAADAGE
AALVLIjKRGCQVVIITLGAEGCVVLSQTEPEPKHIPTEKVKAVD 
TTVSFKI 


5489 


O 1 

OA 


QQ 1 ! 


czv n p\/a A F t nft q m t FT iTDP Kl FT .TCOWREEPKM PLLLLrGETEPLK 
LERDCRSPVEPWAAASPDIATACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490


Q 1 
01 




rixca PVA AFT DO <?NT FLTDPKI FTjGOWREEPKMPLLLLGETE PLK 
LERDCRS PVEPWAAASPDLA1^CX.CHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPPJ^SLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE 

l^ELFEKAAAHLQGLIQVASREQLLYLYARYKQVKVGNCNTPKP 
S FFDFEGKQKWEAWKALGDSS PSQAMQE YIAVVKKLDPGV3NPQ I 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNIFDYCRENNIDH 
ITKAI KS KNVDVNVKDEEGPALLHWACDRGHKELVTVLLQHRAD 
INCQDNEGQTALHYA5ACEFLDIVELLLQSGADPTLRDQDGCLP 
EF^mJCKTVSLVLQRHTTGKA 


5492 


3 


1896 


AS KN PLS AVCTTGIMS SLAVRDPAMDRSLRSVFVGN I PYEATEE 
QLKDIFSF^GSWSFRLVYDRETGKPKGYGFCEYQDQETALSA^ 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAPI IDSPYGDP 
IDPEDAPESITRAVASLPPEQMFE^KQMKTjCVQNSHQEARNML 
LQNPQ^YALr^AQVVMRIMDPEIAJjKILHRKIHVTPLIPGKSQ 
SVSVSGPGPGPGPGIjCPGPNVLLNQ(^PPAPQPQHLARRPVKDI 
PPLMQTPIQGGIPAPGPIPAAVPGAGPGSLTPGGAMQPQU3MPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGbPPRGLLGDAPNDPR 
dCLTT ,t , c vTRTOP PPG YTjGPPHOGP pmhhasghdtrg p s SHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRD2RAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP
MTGGIQGPGPINIGAGGPPQGPRQVPGISGVGNPGAGMQGTGIQ
GTGMQGAGIQGGGMQGAGIQGVSIQGGGIQGGGIQGASKQGGSQ
PSS FS PGQSQVTPQDQEKAAL IMQVLQLTADQ IAMIiP PEQRQS I 
LILKEQIQKSTGAS 


5493 


1 



1876 


RAPMMTKAVPEEPRKPGRLTQALNS PLTWEHVW ICVPGGT PDCL 
TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMYDEI IELKKSLH 
VQKSDVDLMRTKLRRLEEENSRKDRQIEQIjLDPSRGTDFVRTLA 
EKRPDASWINGLKQRILKLEQQCKEKDGTISKLQTDMKTTNLE 
EMRIAMETYYEEVHRLGTLLASSETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDIiDRVLSTSPTISKTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 
QAKADLEKELECAREGEEERREREEVIjREEIQTIjTSKLQELQEM 

kkeekedcpevphkaqelpaptpssrhceqdwppdsseeglprp 
rspcsdgrrdaaaj^vlqaqwkvykhkkkkavldeaavvlqaafr 








GHLTRTKLLASKAHGS EPPS VPGIi PDQSS PVPRVPS P I AQATGS 
PVQEEAIVI IQSALRAHLARARHSATGKRTTTAASTRRRSASAT 
HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


536 


RSKAKIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLLGT 
RRVLU3VSEGTGCMAMEL\^VFLCSLLAPMVLASAAEKEKEMD 
PFHYDYQTIiRIGGLVFAWLFSVGILLILSRRCKCSFNQKPRAP 
GDEEAQVENL I TANATEPQKAEN 


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAFERFCQVWTGPLPLLGQSEPEKWMLPP 
QGAISETRMGHPQFWKYEFGACTGSLASIiEQySEQLKDMVAFFIi 
GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP
LWTMRPIPKDKLEGLVRACCSLGGEQGQPVHMGDPELLGIKEL
SKPAYGDAMVCPPGEVPVFWPSPLTSLGAVSSCETPLAPASIPG 
CTVMTDtiKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLOALEKEVAIIVDORAWN
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT
PRFDHLVAIERAGRAADGNYYNARKMNIKHLVDPIDDLFLAAKK
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 
WTQALPSVIKEEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMIQKLVDVTTAQV 


5496 


3 


2408 


QDTKMHE I YKGN I TPQLNKNTLKTSAATDVWAVY FSQFW I DYEG 

MKSGKGRP IS PVDS FPLS I W I CQPTRYAESQKEPQTCNQVSLNT 

SQSESSDLAGjRLKRKKLLKEYYSTBSEPLTNGGQKPSSSDTFFR 

FSP S S SEADIHLLVHVHKHVSMQ INHYQ YLLLLFLHESIiI LLS E 

NLR KD VEAVTGS PAS QTSICIGI LLRS AELALLLHP VDQANTL K 

SPVSESVSPVVPDYLPTENGDFLSSKRKQISRDINRIRSVTVNH 

MSDNRSMSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYLSDKH 

LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRE 

DSNIIiSFDSDGNQNII^SSTLTSKGNBTIESIFKAEDLLPEAASIi 

SENLDISKEETPPVRTLKSQSSDSGKPKERCPPNIiAPLCVSYKN 

MKRSSSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKGNKKNS 

TITJYRGTAESVNAGANtiQNYGETSPDAI S TNSEGAQEWHDDLMS 

VWFKI TGVNGE I D IRGEDTE I CLQVNQVTPDOLGNI SLRHYLC 

NRPVGSDQKAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFLQ 

CHIKNFSTEFLTSSL^IQHFLEDETVATVMPMKIQVSNTKINL 

KDDSPRSSTOSLEPAPVTVHIDHLVVERSDDGSFHIRDSHMIiNT 

GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 1 

ANFPEFSFDFTREQI^EENESLXQELAKAKMALAEAHLEKDALL 

HHIKKMTVE 


5497 


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSLASQIIREQQSPNV
CFIYKYSGFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSWKDWAKPGPYDQPLVNTLQRRKEKREPD
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEEIiA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 
TPVIPVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEHRQAIPESEAEDQER 
EPPSATVSPGQIPESDPADLSPRDTPQGEDMLNAIRRGVKLKKT
TTNDRSAPRFS 


5498 


2434 


1492 


ILTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKPFECNECX3KAFSQKQYVIKHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 
PFKCSBCGTAFGQKKYLIKHQNIHTGEKPYECNECGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NEC^KAFSQFSTLALHLRIHTGKKPYQCSECGKAFSQKSHHIRH 








QKIHTH 
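
The one-letter codes defined in the column legend of this table can be held in a small lookup table, which also makes it easy to flag characters in a transcribed segment that fall outside the legend. The Python sketch below is illustrative only: the names `LEGEND` and `unexpected_characters` are assumptions introduced here and are not part of the specification.

```python
# One-letter codes exactly as listed in the table's column legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def unexpected_characters(segment):
    """Return (position, character) pairs not covered by the legend.

    Hypothetical helper: whitespace is ignored, and positions are
    0-based offsets into the segment with whitespace removed.
    """
    cleaned = "".join(segment.split())
    return [(i, ch) for i, ch in enumerate(cleaned) if ch not in LEGEND]


if __name__ == "__main__":
    # A segment that uses only legend characters yields an empty list.
    print(unexpected_characters("QKIHTH"))
    # A character outside the legend is reported with its position.
    print(unexpected_characters("NEC^KAFSQ"))
```

A check of this kind only identifies characters that cannot be valid under the legend; it does not indicate what the intended residue was.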


5499 


324 


926 


GFGQIGRGHKITTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF
FVAVP VAVT FLDRVACVARVEGASMQP SLNPGGS QSS D WLLNH 
WKVRNFEVHRGDIVSLVS PKNPEQKI IKRVIAIjEGDIVRTIGHK 
NRYVKVPRGHIWVEGDHHGHSFDSNSFGPVSLGLLHAHATHILW 


5500 


1978 


1286 


KPDWRLQNLPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP
VKHEPVFIRLNPGDRGVMMDISAPFFRDPPAPLGEPGKPFNELW
DYEWEAFFLND I TEQYLE VELCPHGQHLVLLLSGRRNVWKQEL 
PIiSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 
YEALYPVPQHEIiQQGQKPDFHCLEYFKSFNFNTLLGEEWKQPES 
DLwLIEKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEVVDAAARPSSRPFSLP 
AA1MLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGQFSEDMIPTVGFNMRKVTKGNVTIKIWDIGGQPRFRSMWERY 
CRGVNAIVYMIDAADREKIEASRNELHNLLDKPQLQGIPVLVLG 
NKRDLPNALDEKQLIEKMNLSAIQDREICCYSISCKEKDNIDIT
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFP VWVPERTALLi rCFiAxAAirvab b ithAPV? 1 Aw±> PJNio lKrlo lS-L» 

GKFFKGGGSSKSRAAPSPQEALVRLRETEEMLGKKQEYLENRIQ 
REIALAKKHGTQNKRAALQALKRKKRFEKQLTQIDGTLSTIEFQ 
REALENSHTNTEVLRNMGFAAKAMKSVHENMDLNKIDDLMQEIT 
EQQDIAQEISEAFSQRVor wJUr UlLUsi*LiNJT&xjatiiun\^nnLw 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 

IKQLAAWAT 


5503 


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAIKSTA
EKVLVLRFGRDEDPVCLQLDDILSKTSSDIiSKMAAIYLVDVDQT 
AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLLQEEWVLLSQQQKELCGSNKLVAPL 
GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYMG 
EMEVQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSIRDKRSRL 
IEGYTGPFKVETLKYHAKSKAHMFCVNALAARDPIWAARFRSIR 
DP PGDVLAS PEPLFTADCP I F YP £GPLGG FDSMAELLPSSRAEIi 
EDPGGI&XlipAMYiDCISD^ 

SCIQDPSAEGIiSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 
QKELYRDVMRMNYELLASLGPAAAKPDLISKLERRAAPWIKDPN 
GPKWGKGRPPGNKKMVAVREADTQASAADSAI/LPGS PVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCSACIERPNLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 
NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACXQFIKYISETLKREILED 
VRNSPCVSVIiLDSSTDASEQACVGIYIRYFKQMEVKESYlTLAP 
LYSETADGYFETIVSALDELDIPFRKPGWWGI/3TDGSAMLSCR 
GGLVEKFQEVI PQLLP VHCVAHRIiHLAWDACGS I DLVKKCDRH 
IRTVFKFYQSSNKRLNELQEGAAPLEQEIIRLKDLNAVRWVASR
RRTIiHALLVSWPALARHLQRVAEAGGQIGHRAKGMLKLMRGFHF 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES
IjRHQAGP Kb Jci h b NAS r KLXjKIjHvj ± L-bJJ AJjr* VAfiyw r y/iuK.ni\ i v 
LTGIEYLQQRFDADRPPQLKNMEVFDTMAWPSGIEIASFGNDDI 
LNLARYFECSLPTGYSEEALLEEWLGLKTIAQHLPFSMLCK3NAL 
AQHCRFPLLSKLMAVVVCVPISTSCCERGFKAMl^IRTDERTKL 
c!NPVLNMLMMTAVNGVAVTEYDPOPAIQHWYLTSSGRRFSHVYT 
CAQVPARSPASARLRKEEMGALYVEEPRTQKPPILPSREAAEVL
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCSPRSLSAAKMSNRNNNKLPSNLPQLQNLIKRDPPAYIEEFLQ 
QYNHYKSNVEIFKLQPNKPSKELAELVMFMAQISHCYPEYLSNF 
PQEVKDIiLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
FFELFRCHDKLLRKTLYTHIVTDIKNINAKHK1WKVNVVLQNFM 
YTMLRDSNATAT^KMSLDVMIELYRRNIWNDAKTVNVITTACFSK 
VTKILVAALTFFLGKDEDEKQDSDSESEDDGPTARDLLVQYATG 
KKSSKNKKKLEKAMKVLKKHRKKKKPEVFNFSAIHLIHDPQDFA 
EKLLKQLECCKERFEVKMMLMNLISRLVGIHELFLFNFYPFLQR
FLQPHQREVTKILtLFAAQASHHLVPPEI iqsllmtvannfvtdk 
NSGE VMTVG INAI KB 1 TARCPLAMTE ELLODLAOYKTHKDKNVM 
MSARTLIHLFRTLNPQMLQKKFRGKPTEASIEARVQEYGELDAK
DYIPGAEVLEVEKEENAENDEDGWESTSLSEEEDADGEWIDVQH 
SSDEEQQE I SKKLNSMPMEERKAXAAAISTSRVLTQBDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERIiHKK 
PKSDKETRIiATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK


5506 


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWS PLEGGGDEELRPNPYVRFP YRWWAWV 
LAAFPSLGAGGETPEAP PES WTQLWFFRFWNAAGYAS FMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAETTPMWQALKTjLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVLMNRVLALIVAGLSCVLCKQPRHGAPMYRY
SFASLSNVLSSWCQYEALKFVSFPTQVLAKASKVIPVMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAGYIAFDSFTSNWQDALFAYKMSSVQMMFGVNFFSCLFTVGSL
LEQGALLEGTRFMGRHSEFAAHALLLSICSACGQLFIFYTIGQF 
GAAVFTI IMTLRQAFAILLSCLLYGHTVTWGGLGVAWFAALL 
LRVYARGRLKQRGKKAVPVESPVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK
VAVADVQPGPMRFHQDQLQVXiIiVFTKEDNQCNGFCRACEKAGFK 
CTVTKEAQAVLACFLDKHHDI 1 1 1 DHRNPRQLDAEALCRS I RSS 
KLS ENTVI VGWRRVDREELSVMP F ISAGFTRRYVENPNIMACY 
NELLQLEFGEVRSQLKLRACNSVFTALENSBDAIEITSEDRFIQ 
YANPAFETTMGYQSGELIGKELGBVPINEKKADLLDTINSCIRI 
GKEWQGIYYAKKKNGDNIQQNVKIIPVIGQGGKIRHYVSIIRVC
NGNNKAEKISECVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT
EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA
LDRVLEILRTTELYSPQFGAKDDDPHANDLVGGLMSDGLRRLSG
NEYVLSTKNTQMVSSNIITPISLDDVPPRIARAMENEEYWDFDI
FRT »K AATHNRTpLI YjjGLKMFARJgTj 1 C£FLHCSESTLRSWLQ I IE 
ANYHSSNPYHNSTHSADVLHATAYFLSKERIKETIiDPIDEVAAL 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF
QLTTGDDKCNIFKNMERNDYRTLRQGIIDMVLATEMTKHFEHVN
KFVNS INKPLATLEENGETDKNQEVINTMLRTPENRTIjI KRMLI 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEEKQQGLPWMPV 
FDRNTCSIPKSQISFIDYFITDMFDAWDAFVDLPDLMQHLDNNF 
KYWKGIjDEMKLRNIiRPPPE 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN 
VL KKVL VDQLVAS PLLGVW YFLGLG CLEGQTVGES CQEhRERFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRS P VP LT PPG C VALDTRAD 


5509 


1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLKQVDFLNWE 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDVVTDPAFL»VTRSM 
EDFVTWVDSSKIKRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHLSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSKM
AEGERQPPPDS SEEAP PATQN F 1 1 PKKE IHTVPDMGKWKRSQAY 
ADYIGF I LTLNEGVKGKKLTFEYRVSEAIEKIiVALLNTLDRWID 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKESVGNSTRIDYGTGHEAAFAAFLCCLCKIGVLRVDDQ
IAIVT'KVFNRYLEVMRKLQKTYRMEPAGSQGVWGLDDFQFLPFI 
WGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLECILFITEMKT 
GPFAEHSNQL.WNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511 


276 


1980 


KLSRVLNLPPENLITSISAVPISQKEEVADFQLSVDSLLEKDND 
HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINREIiLTK 
TVLQQVIEDGSKYGLKSELFSGLPQKKIWEFSSPNVAKKFHVG 
HLRST 1 1 GNFIANLKEALGHQVTRINYLGDWGMQFGLLGTGFQb 
FGYEEKLQSNPLQHLFEVYVQVNKEAADDKSVAKAAQEFFQRLE 
LGDVQALSLWQKFRDLSIEBYIRVYKRLGVYFDEYSGESFYREK 
SQEVhKhhBS KGLLhKTIKGTAWDhSGNGDPSS ICTVMRSDGT 
SLYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLK1 
MG YDWAERCQHVP FGWQGMKTRRGDVTFLEDVLNE IQLRMLQN 
MASIKTTKELKNPQETAERVGLAALIIQDFKGIjLLSDYKFSWDR 
VFQSRGDTGVFLQYTHARLHSLEETFGCGYLNDFNTACLQEPQS 
VSILQmiliRFDEVLYKSSQDFQPRHrVSYLLTLSHLAAVAHKTL 
QIKDSPPEVAGARLHLFKATOSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSLLLTITVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
IEKEKISHNTRRFRFGLPSPDHVLGLPVGNYVQLLAKIDNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYIiENMK 
IGETIFFRGPRGRIiFYHGPGNLGIRPDQTSEPKKTLADHljGMIA 
GGTGITPMLQIiIRHITKDPSDRTRMSLIFANQTEEDILVRKELE 
BIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STL1 LVCGP P PLIQTAAHPNLEKLGYTQDMI FTY 


5513 


2 


837 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGKTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 
GKMLDKYIYGAQGVLLVYDITNYQSFENLEDWYTWKKVSEESE 
TQPLVALVGNKI DLEHMRTI KPBKHLRFCQENGFSSHFVSAKTG 
DS VFLCFQKVAAEILGI IGjNKAE I EQSQR WKAD I VNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPSWIMGNFRGHALPGTFFFIIGLWWCTKSILKYICKKQKRT
C YLGS KTLF YRIiE ILEG I T I VGMALTGMAGEQ F I PGG PHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVEAFI FYNHTHGREMLDI FVHQIiLVLWFLTGLVAFL 
EFLVRNNVLLELLRSSBILLQGSWFFQIGFVLYPPSGGPAWDLM 
DHENILFLTICFCWHYAVTIVIVGMNYAFXTWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515 


1572 


260 


FVRLVGRGD CD P LLS VCLTTMP LYEGLGSGG EKTAWIDLGEAF 
TKGGFAGETGPRCTrPSVIIGtAGMPKPVRVVQYNINTEELYSYL 
KEFIH ILYFRHLLVNPRDRRWI IESVIjCPSHFRETLTRVLFKY 
FEVPSVLLAPSHLMALLTLGINSANVLDCX3YRESLVLPIYEGIP 
VIjNCWGALPIXSGKALHKELETQLLEQCTVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSPLKRGLKIQAAKFNIDGNNERPSPPP
NVDYPLDGEKILHILGSIRDSWEILFEQDNEEQSVATLILDSL 
IQCPIDTRKQLAENIiWIGGTSMLPGFLHRLLAEIRYLVEKPKY 
KKALGTKTFRIHTPPAKANCVAWLGGAIFGALQDILGSRSVSKE
YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREPPQAGPGPSPRKSPTASSFLFPWRPLASSFWMGAQGAQES 
IKAMWRVPGTTRRPVTGESPGMHRPEAMLLLLTLZOiliGGPTWAG 
KMYGPGGGKYFSTTEDYDHEITGLRVSVGI^LVKSVQVKLGDSW 
DVKLGALGGNTQEVTLQPGEYITKVFVAFQAFLRGMVMYTSKDR
YFYFGKLDGQISSAYPSQEGQVLVGIYGQYQLLGIKSIGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR


5517 


246 


499 


SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA
TDGTSDLPLKLEALSVKEDAKEKDEKTTQDQLEKPQNEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLiGLIjLPLVAALDFNYHRQEGMEA 
FLKTVAQNYSSVTHLHSIGKSVKGRNLWVLWGRFPKEHR1GIP 
EFKYVANMHGDETVGRELLLHLIDYLVTSDGKDPEITNLINSTR 
IHIMPSMNPDGFEAVKKPDCYYSIGRENYNQYDLNRNFPDAFEY 
^JNVSRQPETVAVWKWLKTETF VLSANLHGGAL VA5 YP FDNGVQA 
TGALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECKNKMNFPN
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP
SFWNNNKASLIEYIKQVHLGVKGQVFDQNGNPLPNVIVEVQDRK
HI CPYRTNKYGEYYIiLLLPGSYI INVTVPGHDPHITKVI IPEKS 
QNFSALKKDILLPFQGQLDSIPVSNPSCPMIPLYRNLPDHSAAT 
KPSLFLFLVSLLHIFFK 


5519 


87 


477 


I KSKLNQQVEVQESEWRLTEAKGPTMGKBSGTOSGRAAVAAVVG 
G WAVGTVLVALSAMGFTSVG IAASS IAAKMMSTAAI ANGGGVA 
AGSLVAI LQSVGAAGLSVTS3CVI GGFAGTALGAWLGS PPSS 


5520 


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHFLVIjSWYTFLNYYI 
SQEGKDEVKPKILANGARWKYMTLLNLLLQTI FYGVTCLDDVLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFWILFLYNRDL 
I YP KVLDTVI P VWLNHAMHTFI FP I TIAEWLR PHS YPS KKTGL 
TLIAAASIAYISRILWLYFETCTWVYPVFAKLSLLGIjAAFFSLS 
YVFIASIYIiLGEKLNHWKWVSVQILQRWRLESVGICFQWPDWKS 
PAKHQLVKNIR 


5521 


546 


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPQPSEB 
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGSTDHRGVPGKPGRVVTLVEDPAGCVWGVAYRLPVGKEEEVK 
A YLDFREKGG YRTTTVT FYPfCDP TTKP FS VLL Y IGTCDNP D YLG 
PAPLEDIAEQI FNAAGPSGRNTE YLFELANS IRNLVPEEADEHL 
FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQILRAIGKGSFG
KVCIVQKRDTEKMYAMKYMNKQQCIERDEVRNVFRELEILQEIE
HVFLVNL W YS FQDEEDMFMWDLIiLGGDLRYHLQQNVQFS EDTV 
RLYICEMALALDYLRGQHIIHRDVKPDNILLDERGHAHLTDFNI
ATIIKDGERATALSGTKPYMAPEIFHSFVNGGTGYSFEVDWWSV
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWSKEM
V^LRKLLT^PEHRhSSLQDVQAAPALAGVLWDHLSEKRVKPG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRIiAKNKSRDNSRD 
SSQSENDYLQDCLDAIQQDFVIFNREKIiKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524 


85 


2318


RERERDHRPGESSQGQSGAGGCFPSPTMELRCGGLLFSSRFDSG 
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CAETEFENGNRSWFYFSVRGGMPGKLIKINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWER IRDRPTFEMTETQFVLS FVHRFVEGRGA 

ELLCYSLDGLRVDLLTITSCHGLREDRKPRLEQLFPDTSTPRPF 
RFAGKRI FFLSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTLR 
RLFVFKLIPMLNPDGVVRGHYRTDSRGVNLNRQYLKPDAVLHPA 
IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDLEKAN 
I^LQNEAQCGHS ADRHNAEAWKQTEPAEQKLNSVW I MPQQS AGLE 
ESAPDTI PPKESGVAYYVDLHGHAS KRGCFMYGNS FSDESTQVE 
NMLYPKIilSLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
AI YKASGI IHS YTLECNYNTGRSVNS I PAACHDNGRASPPPPPA 
FPSRYTVELFEQVGRAMAIAALDMAECNPWPRIVLSEHSSLTNIi 
RAWMLKHVRNSRGLSSTLNVGVNKKRGLRTPPKSHNGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGIj 
PGLGSSTQKVTHRVLGPVRGKPVWEPLQHVFGCLGHCWGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEEFLGRVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 
WRWCTKINKS SGI VEASR IMNLYQF I QLYKDITSQAAGVLAQ 
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCICMDGRAD
LILPCAHS FCQKCIDKWSDRHRNCPI CRLQMTGANES WWSDAP 
TEDDMANYILNMADEAGQPHRP


5526 


3 


853 


RRPCNPVRAAKRTGAAARAPRGLEVTMLRVAWRTLSLIRTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDPPPSTI^KDYQNVPGIEKVDDVVKRLLSLEMANKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEEHLEKHRKDK 
AHKRYLLMSIDQRKKMLKNLRNTNYDVFEKICWGIiGIEYTFPPL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 
RRNPDSPAKAI PKTLKDSQ 


5527 


3225 


565 


IiLRKYLLHQNPLLLRHQPNRTCISFSATMKLKDTKSRPKQSSCG 
KFQTKGIKWGKWKEVKIDPNMFADGQMDDLVCFEELTDYQLVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 
KNVATEGTSTQKEFEVKDPELEAQGDDMVCDDPEAGEMTSENLV 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS
AWKDLFVPRPVLRALSFLGFSAPTPIQALTLAPAIRDKLDILGA
AETGSGKTLAFAIPMIHAVLQWQKRNAAPPPSNTEAPPGETRTE 
AGAETRSPGKAEAESDALPDDTVIESEALPSBIAAEARAKTGGT 
VSDQALLFGDDDAGEGPSSLIREKPVPKQNENEEENLDKEQTGN
LKQELDDKSATCKAYPKRPLIiGLVIiTPTRELAVQVKQHIDAVAR 

FTG I KTAI LVGGMSTQKQQRMLNRRPE I WATPGRLWELI KEKH 
YHLRNLRQLRCLVVDEADRMVEKGHFAELSQLLEMLNDSQYNPK 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVIDLTRNEATVETLTETKIHCETDEKDFYLYYFLMQYPG 
RSLVFANSISCIKRLSGLLKVLDIMPLTLHACMHQKQRLRNLEQ
FARLEDCVLIATDVAARGLDIPKVQHVIHYQVPRTSEIYVHRSG
RTARATNEGLSLMLIGPEDVINFKKIYKTLKKDEDIPLFPVQTK 
YMD WKER I RLARQ I E KS E YRNFQACLHKS W I EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSGKPPLLVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 

QPSTSAN 




•a 


895 1 


^ G P Fl*S ACRMWGACKVKVHDSLAT I S I TLRR YLRIjGATMAKS KFE 
YVRDFEADDTCLAHCWVVVRLDGRKFHRFAEKHNFAKPNDS RAL 
QLMTKCAQTVMEELEDIVIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFAS S YVFYWRDYFEDQPLLYPPGFDGRVWYP SNQT 
LKDYLSWRQADCHINNLYNTVFWALIQQSGLTPVQAQGRLQGTL
AADKNEILFSEFNINYNNEPPMYRKGTVLIWQKVDEVMTKEIKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


48 


640 


TFRIiVSAHLKTRKLINPEAAERRWRDWDSROGWIiS VKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEVVSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 

AAAMENPNLREIVEQCVLEPD


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDLKPENWFFEKQGLVKLTDFGFSNK 

FQPGKKr»TTSCGSLAYSAPEiLLGDEYDA * 
VCGQPPFQEANDSETLTMIMDCKYTVPSHVSKECKDLITRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN
SIIQRMVLGDIADRDAIVEALETNRYHHITATYFLLAERILREK
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQSPARAADSVIiNGHRSKGLCDSAKKDDLPEIiAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVIjNQIFEEGESDDEFDMDENLPPKI^RLKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST
GNAGQVPAVGGIKFFSDHMADTTTELERIKSKNLKNNVLQLPLC
EKTISVNIQRNPKEGLLCASSPASCCHVT


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGTVLFARLF 
ALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAV 
TNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKC
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE


5532 


3395 


1402 


SDWMWGKRKM 1 1 EDETE FCGEELLHS VLQCKS VFDVtiDGEEMR 
RARTRANPYEMIRGVFFLNRAAMKMANMDFVFDRMFTNPRDSYG
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFEPYYGEGGIDGDGDITRPENISAFRN 
FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 
LSIVRTGGHFICKTFDLFTPFS VGLVYLLYCCFERVCLFKP ITS 
RPANSERYWCKGLKVGIDDVRDYLFAVNIKLNQLRNTDSDVNL 
WPLEVIKGDHEFrDYMIRSNESHCSLQIKALAKIHAFVQDTTL 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKFFELIQGTEI 
D 1 FS YKPTLIjTS KTIjEKI rp vfdyrc mvsgseqkfliglgksqi 
YTWDGRQSDRWIKLDLKTELPRDTLLSVEIVHELKGEGKAQRKI
SAIHILDVLVMGTDVREQHFNQRIQLAEKFVKAVSKPSRPDMN
PIRVKEVYRLEEMEKIFVRLEMKIIKGSSGTPKLSYTGRDDRHF
VPMGLYIVRTVNEPWTMGFSKSPKKKFFYNKKTKDSTFDLPADS
IAPFHICYYGRLFWEWGDGIRVHDSQKPQDQDKLSKEDVLSFIQ


5533 


94 


789 


MKERRAPQPWARCKLVLVGDVQCGKTAMLQVLAKDCYPETYVP 
TVFENYTACLETEEQRVELSLWDTSGSPYYDNVRPLCYSDSDAV
LLCFDISRPETVDSALKKWRTEILDYCPSTRVLLIGCKTDLRTD
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
Hb 1 J? Ki AbMxjCJjNivPS PljPQKIbP VKSXjSKRLIiHIjPSRSELISPT 
FKKEKAKXCS I M 


5534 


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVEKRYLAAGAVTLLSLYLLFGYGASLLCNLIGFVYPAYASIK
AIESPSKDDDTVWLTYWVVYALFGLAEFFSDLLLSWFPFYYVGK 
CAFIiLFCMAPRPWNGALMLYQRVVRPLFLRHHGAVDRlMNDLSG 


5535 


1029 


332 


KSPMDSEARLCSLVELSDTQDETQKSDSENEDIjKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNIKMKEDLIKELIKTGNDAK
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM
KVKLQKE PRKKVDAAKLRVQ VLQKKQQDSKKIiAS LS I QNEKRAN 
EL1EQ6 VDHMKYQ K IQIjQRKJjQEENE krkqldavikrdqqkikvi 
LSYIPAKYNMKC 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRAPPPSAAPLPTGRAQMSP 
SGRLCLLTIVGLILPTRGQTLKDTTSSSSADATIMDIQVPTRAP 
DAVYTELQPTSPTPTWPADETPQPQTQTQQLEGTDGPLVTOPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
F YDEOTLR KRGLL VAAVLF I TG III LTS GKCRQLS RLCRNHCR 


5537 


3 


2391 


RARVS S P QLRV FRSGR PRRLRVLR INRTS VALRIiAGTGR FVAKT 

PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 

YRNLVSLGLWSKPDIilTFLEQRKEPWNVKSEETVAIQPDVFSH 

YNKDLLTEHCTEASFQKVISRRHGSCDLENLHLRKRWKREECEG 

imSCYDEKTFKYDQFDESSVESIiFHOQlIiSSCAK^YNFD^ 

FTHSSLLNQQEEIDIWGKHHIYDKTSVLFRQVSTLNSYRNVFIG 

EKNYHCNNSEKTLNQSSSPKNHQENYFLEKQYKCKEFEEVFLQS

MHGQBKQEQSYKCNKCVEVCTQSIiKHIQHQTIHIRENSYSYNKY 

DKDLSQSSNLRKQIIHNEEKPYKCEKCGDSLNHSLHLTQHQIIP

TEEKPYKWKECGKVFNLNCSLYLTKQQQrDTGEKLYKCKACSKS 

FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 

EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS

bNLilMrlU K V ti JajEKP Y KCKisCGKV r SRS S CIjTQHkKIHTGENJjY 

KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 

RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 

CGKAFSYSSDVIQHRRIHTGQRPYKCEECGKAFNYRSYLTTHQR 

SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 

FSYRSYLTTHRRSHSGERPYKCEECGKAFNSRSYXIAHQRSHTR 




5538 


926 


161 


HSM^KIPWGSIPVLMLLLLLGLIDISQAQLSCTGPPAIPGIPG \ 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDOTIRFDHVT TMMMNMVR DP <?f5TT FTPKVPrtT . WPTYWA 

SSRGNLCVNI^RGRERAQKWTFCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANSIFSGFLLFPDMEA


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSC31RPQPMARRYDEIjPHYPG 
IVDGPAAIiASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREK 

DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL
ELEKVHDLCDNFCHRYI TCIiKGKMPIDLVI EDRDGGCREDFEDY 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTSVAS PSSGGEDEDLDQERRRNKKRGIFPKVATNIM 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS
LNLEGEWHYL 


5540 


148 


1440


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLIjPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFBDYPASCPSLPDQNKrWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSYASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5541 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPIjIiALVFEKCEIjATCS PRDGAGAGLGT PPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
IAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5542 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYTBLPHYPGIVD 
GPAALASFPET7PAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCEIATCSPRDGAGAGIjGTPPGGDVCSSDS 

FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG

-QPIGGYTOTEPJTVftFJ^PffiV^ * 


5543 


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP 
KRPFSDSGAFWSPERRPGVLEAPRRRPVPASFRAVPPKPTRVHG
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ
RESRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV
DNIAWMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAIMAA 
ARHQCSYTiVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHRPWLITKEHIQALLKTGEHTWSIiAELIQALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLIjRDEGTSQEEMESRFELEKSESIjL 
VTPSADILEPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTSVLRRAIWNYIHCVFGIRYDDYDYGEVNQLLERNLKVYIKTV
AC YPE KTTRR^fYNLFWRHFRHSEKVHVNIlLLLEARMQAALLYA^» 
RAITRYMT 


5544 


1895 


514 


LGGLLGRQRLLLRMGAGRLGAPMERHGRASATSVSSAGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR
QRVESLRKKRPIiFPWFGLDIGGTLVKLVYFEPKDITAEEEEEEV 
ESLKSIRKYLTSNVAYGSTGIRDVHLELKDLTLCGRKGNLHFIR 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRDIYGGDYERFG 
LPGWAVAS S FGNMMS KEKREAVS KEDLARATL I T I TNN IG S 1 AR 
MCALNENINQWFVG^FLRIWTIAMRIJjAYALDYWSKGQLKALF 

SEHEGYFGAVGALLELLKIP


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLHSHDIKYGSGSGQQSVTGVEASDElANSYWRIRG 
GSEGGCPRGSPVRCGQAVRLTHVLTGKNLHTHHFPSPLSNNQEV
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSVFLSVT
GEQYGSPIRGQHEVHGMPSANTHNTWKAMEGIFIKPSVEPSAGH
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK 
KLLQEKFPNMSRLQDISELIATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS
FVS KTEEELQAI LBAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL
0)QKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT
RGCTGRWIRQLSLDVRRVMEPLTASRLQVRKKNStiKDCVAVAGP 
LGVTHFLILSKTETflrVYFKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH
RKKSLEGMKKARVGGSDEEASG1PSRTASLBLGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEEN 
REKNR YPNILPNDHSRVILSQLDG I PCS DYINAS YJDGYKEXNK 
FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY
WPDQGCTmTGNIRVCOTDCVVLVDYTIRKFCIQPQLPIXJCKAPR 
• LVSQMFTSWPOTGVPPTFIGMnCFLTClCVKT^ 
SAGVGRTGTFIVIDAMMAMMHAEQKVDVFEFVSRIRNQRPQMVQ
TDMQYTFIYQALLEYYLYGDTELDVSSLEKHLQTMHGTTTHFDK
IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLSEAISIRDFLVTLNQPQARQEEQVRWRQFHFHGWPEIG 
IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL
SNILERVKAEGLLDVFQAVKS LRLQRPHMVQTLEQYEFCYKWQ 
DFIDIFSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ
CLARG SAG I KGLGR VFR I MDDDNNRTLD FKEFM KGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKFLD
NFDSPYDKDGLVTPEEFMNYYAGVSASIDTDVYFIIMMRTAWKL


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
WAMKCQYVQADVLDLAETMVASADGLVYEPTVFDLSPQQKEWQ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFHITVGKAIPHPRGHAHLAALVNHESYN 
FSHRIDHLSFGELVPAIINPLDGTEKIAIDHNQMFQYFITWPT 
KLHTYKISADTHQFSVTERERIINHAAGSHGVSGIFMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIFSTTGMLHGIGKFIVEIIC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 
WFVFRRYAEFDKLYNTLKKQFPAMALKIPAKRIFGDNFDPDFIK 

QRRAGLNEFIQNLVRYPELYNHPDVRAFLQMDSPKHQSDPSEDB 

DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 

LLAKRKLDGKFYAVKVLQKKIVTiNRKEQKHIMAERNVL^^ 

PFLVGLHYSFQTTEKLYFVLDFVNGGELFFHLQRERSFPEHRAR 

FYAAE IASALGYLHS I KIVYRDLKPENIIiLDSVGHWLTDFGLC 

KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 

EMIjYGLPPFYCRDVAEMYDNIIjHKPLSLRPGVSLTAWSILEELL 

EKDRQNRLGAKEDFLE IQNHPFFESLSWADLVQKKI P PPFNPNV 

AGPDDIRNFDTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 

FS YAPPS EDLFL 


5552 


2748 


930 


LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPLEKPLKLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRKDPHGFFAFPVTDA
I APGYSM 1 1 KHPMDFGTMKDKI VANEYKS VTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKILHAGFKMMSKQAALLGNEDTAVEEPVP 
EWP VQVETAKKS KKPSRE V I S CM FEPEGNACSLTDS TAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLIiYSWNTAEP 
DADEEETHPVDbSSLSSKLLPGFTTLGFKDERRNKVTFIiSSATT 
AliSMQNNSVFGDLKSDEMELIiYSAYGDETGVQCALSLQEFVKDA 
GS YS KKWDDIiLDQI TGGDHSRTLFQLKQRRNVPMKPPDEAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDQHHL
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT


5553 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPLLESWALSQVAGMP 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLGIPATIVLPESTSLQWQRLQGEGAEVQLTGKVWD 
EANLRAQELAKRDGWENVP PFDHPLI WKGHASLVQELKAVLRTP 
PGALVLAVGGGG LLAG VYAGLLE VGWQHVP 1 1 AMETHGAHCFNA 
A I T AGKLVTLPD I TSVAKS LGAKTVAARALECMQVCKIHS EWE 
DTEAVSAVQQLLDDERMLVEPACGAALAAIYSGLLRRLQAEGCL
PPSLTSWVIVCGGNNINSRELQALKTHLGQV 


5554 


166 


2318 


CSGRTGGRGSIiRPAENVCLTCKliSGAETRGLIiCPALRTWlMKVL 
GRS FFWVLFPVLPWAVQAVEHEEVAQRVI KLHRGRGVAAMQSRQ 
.WVRDSGRKLSGLLRQKNAVIiOT^^ 

VHTFE I FQKELNESENSVFQAVYGLQRALQGD YKDWNMKESSR 
QRLEALREAAIKEETEYMELLAAEKHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEEIEEHAFDDNKSVKGVNFEAVLRVEEEE
ANS KQN I TKRE VE0DLGLS ML I DS QNNQ YILTKPRDSTI PRADH 
HF I KDI VT IGMLS L PCG WL CTAIGLPTM FGY 1 1 CGVLI/5P SGLN 
SIKSIVQVETLGEFGVFFTLFIjVGLEFSPEKLRKVWKISLQGPC 
YMTLLMIAFGLLWGHLLRIKPTQSVFISTCLSLSSTPLVSRFLM
GSARGDKEGDIDYSTVLLGMIiVTQDVQLGLFMAVMPTLIQAGAS 
ASSSIWEVLRILVtilGQILFSLAAVFLLCLVIKKYLIGPYYRK 
LHMESKGNKEILILGISAFIFLMLTVTELLDVSMELGCFLAGAL
VSSQGPWTEEIATSIEPIRDFLAIVFFASIGLHVFPTFVAYEIj 
TVLVFLTLSVVVMKFIjLAALVLSIjILPRSSQYIKWIVSAGIiAQV 
SEFSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI
TRCVPRPERRSSL 


5555 


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR
G TMAPQNLS TFCLLLL YX IGAVI AGRDFYKIIiG VPRSAS IKDXK 
KAYRKIALQLHPDRNPDDPQAQEKFQDLGAAYEVIjSDSEKRKQY 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPRQQDRNTPR 
GSDI IVDLEVTLEEVYAGNFVEWRNKPVARQAPGKRKCNCRQE 
MRTTQIiGPGRFQMTQEVGCDECPNVKLVNEERTLEVEIEPGVRD 
GMEYPFIGEGEPHVDGEPGDLRFRIKWKHPIFERRGDDLYTNV 
TISLVESLVGFEMDITULDGHKVHISRDKITRPGAKLWKKGEGL 
PN FDNNN I KG S L 1 1 TFDVDFP KEQLTE EAREG I KQLLKQG? VQK 
VYNGLQGY 


5556 


5835 


3346 


RTRGMSKNCVPMEFEEYLLRMFQGTFYLLQKITKDNNAHTVKSR


SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEELDESYIEKFTDFLRLFVSVHLRRIESYSQFPWEFLTLLFK 

YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 

DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 

QSLEWAKVMELLPTHAFSTLFPVLQDNLEVYLGLQQFIVTSGS 

GHRLNITAENDCRRLHCSLRDLSSLLQAVGRLAEYFIGDVFAAR 

FNDALTWERLVKVTLYGSQ I KLYNI ETAVPS VLKPDL I DVHAQ 

SLAALQAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST 

KVQDKLLLSACHLLVSLATTVRPVFLIS I PAVQKVFNRI TDASA 

LRLVDKAQVLVCRAIiSNILLLPWPNLPENEQQWPVRS inhasli 

SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI 

SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMIjSFFIi 

tlfrgkrvomgvpfteoi iotflnmftreolaes ilhegstgcr 

WEKFLKI LQVWQEPGQVFKPFLPSI IALCMEQVYP I IAERPS 

PDVKAELFELLFRTLHHNWRYFFKSTVLASVQRGIAEEQMENEP 

QFSAIMQAFGQSFLQPDIHLFKQNLFYLBTLNTKQKLYHKKIFR 

TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYN^4ASVDFDGFFA : 

AFLPEFLTS CDGVDANQKSVLGRNFKMDRVRRERGRAKRRAEWA 

RKPGTCAARRGHIEASGRGLCPPCSIiAAAHEMPADLVL 
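
The one-letter legend given in the table header can be applied mechanically to any amino acid segment in this table. The following is a minimal sketch, assuming Python; the names `ONE_LETTER` and `decode_segment` are illustrative and do not come from the specification. Whitespace is skipped, and any character that is not in the legend (for example a lower-case letter or a stray mark left by scanning) is reported separately rather than guessed at.

```python
# Legend as stated in the table header. The dictionary and function names
# are hypothetical helpers introduced only for this illustration.
ONE_LETTER = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Translate a segment into legend names.

    Returns a pair (names, unrecognized): the legend names in order of
    appearance, and the characters that are not part of the legend.
    Whitespace is ignored.
    """
    names = []
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        name = ONE_LETTER.get(ch)
        if name is None:
            unrecognized.append(ch)
        else:
            names.append(name)
    return names, unrecognized
```

Calling `decode_segment` on one line of a segment returns the residue names for that line, and a non-empty second element indicates characters that fall outside the legend.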


5557 


1712 


491 


VILGAGLRDKDMWIPWGLPRRLRIjSALAGAGRFCILGSEAATR 
KHLPARNHCGLSDSSPQIjWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKW 
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMSSRGLFKNLGIEAVPWTADIPLKWGMFPSRGEKRALWKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY 
LIQMIPRQNIiFTKNLTPMNYNIFFHLLKHCFGRRSATVIDHLRS 
LTPLDARDILMQIGKQEDEKWNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558


1509


96


RAGCTHPQVPADLGAPAE PRR PQKTCVCLLQPQPGGQRGPTTMI 
TGVFSMRLWTPVGVLTSIAYCLHQRRVALAELQEADGQCPVDRS 
LIiKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD * 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVEDIPFLSPTFNPOEVFrRSTNIFRNLiESTRCIjlAA 

imi *\ xj i\ i vxm x v i_jxs jju 4 ii x yu v x a. x xi *x x ivx* i trt w x tv\^ in iri 

GLFQCQKEGP 1 1 IHTDEADSEVLYPNYQSCWSLRQRTRGRRQTA 
SLQPGISEDLKKVKDRMGXDSSDKVDFFILLDNVAAEQAHNIiPS
' C$ltola*FXRMlEQR^^ 
NLIjKAMDSATAPDKI RKLYLYAAHDVTFIPLLMTLG I FDHKWPP 
FAVDLTMELYQHLESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF
LNAMSVYTLS PEKYHALCSQTQVMEVGNEE 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 

ELDWDPDGSVPVGLRQRNQTEKQSTGVYNREAMLNFCEKETKK 

LMQREMSMDESKQVETKTDAKNGBERGRDASKKALGPRRDSDLG 

KE P KRGGLKKS FS RDRDEAGGKSGE KP KEEKI I RG I DKGRVRAA 

VDKKEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 

MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 

VKRGTGNTDTKKDDEKVKKNEPLKEKEAKDDS KTKTPEKQTPSG 

PTKPS EGPAKVEEEAAPS I FDEPLERVKNNDPEMTE VNVNNS DC 

ITNE I LVRFTE ALEFNTVVKLFALANTRADDHVAFAI AIMLKAN 

KTITSLNLDSNHITGKGIIiAIFRALLQNNTLTELRFHNQRHICG 

GKTEMEIAKLLKENTTLLKIiGYHFEIAGPR 

RQKRLQEQRQAQEAKGEKKDLLEVPKAGAVAKGSPKPSPQPSPK 

PSPKNSPKKGGAPAAPPPPPPPLAPPLIMENLKNSLSPATQRKM 

GDKVLPAQEKNSRDQLLAAIRSSNLKQLKKVEVPKLLQ 


5560 


9 


921 


SSWEFSALSVSMACLS PSOLOKFOODGFLVLEGFLSAE ecvam 

qqrigeivaemdvplhcrtefstqeeeqlraqgstdyflssgdk 
irfffekgvfdekgnflvppeksinkighalhahdpvfksiths 
fkvqtlarslglqmpvwqs m y i fkqphfggevs phqdas flyt 

EPLGRVLGVWIAVEDATLENGCLWFI pgshtsgvsrrmvrapvg 

sapgtsflgsepardnslfvptpvqrgalvlihgewhkskqnl 
sdrsrqaytfhlmeasgttwspenwlqptaelpfpqlyt 


5561 


2175


1775


cyfifqffsspypglhphqtpaplpnpglypppvsmspgqpppq
QLLAPTYFSAPGVMNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVTIKPPPPEWSRGS 
S 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGLKTVGVLSQADYIiNLVQCETLEDLKLH
LQSTDYGNFLANEASPLTVSVIDDRLKEKMVVEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILL.ITGTLHQRSIAELVPKCHPLGSFE 
QMEAWIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYKAYLE SFYKFCTLLGGTTADAMCP I LEFSADRRAF I IT 
INS FGTELS KEDRAKLFPHCX3RLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKNJD^3AAAGAAGLVRGLKAGVLSQADYI^LVQCETLBDLKLH 
LQSTDYGNFLANEASPLTVSVIDDRLKEKMVVEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPIiGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIBI 
IRNTLYKAYLE S FYKFCTLLGGTTADAM CP I LE FEADRRAF 1 1 T 
INS FGTELSKEDRAKLFPHCGRLYPEGLAQLARADD YEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALLL 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEILPEGDATTMGPPVTLESVTSLRNATTM 
GPPVTLES VPS VGNATS S SAGDQSGS PTAYGVI AAAAVLS AS L V 
TATLLLLSWLRAQERLRPLGLLVAMKES LLLSEQKTSLP 


5565 


993 


138 


RWNS PNPARAGS I SRPQRAPGSVSAVAMTAAVFFGCAFIAFGPA 
LALYVFTIATEPLRIIFLIAGAFFWLVSLLISSLVWFMARVIID 
NKDGPTQKYLLIFGAFVSVY IQEMFRFAYYKLLKKASEGLKS IN 
PGETAPSMRLLAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 
GDSPQFFIiYSAFMTLVIILLHVFWGIVFFDGCEKKKWGILLIVL 
LTHLL VS AQT F I S S YYG INLASAFI I LVIjMGTWAF LAAGGS CRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVS WMI S RAWLVFGMLY P AY Y S YKAVKT 
KNVKEYVRWMMYWIVTALYTVIEWADQTVAWFPLYYELKIAFy 
-IWLLSPYTKGASLraOCFLHPto '* 
MVNFGRQGLNLAATAAVTAAVKSQGAI TERLRSFSMHDLTTIQG 
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 
EGP Y SDNEMLTKKGPRRSQ SMKS VKTTKGRKEVR YG S LKYKVKK 
RPQVYF 


5567 


1554 


233 


EFLGSGVS PDLANEDGLTALHQCCIDDFREMVQQLLEAGANINA 
CDSECWTPLHAAATCGHLHLVELLIASGANIiLAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDSIEAARAVPELRMLDDIRSRLQ 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
GWEPLHAAAYWGQVPLVELLVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLLELKHKHDALLRAQS RQRSLLRRRTSSAGSRGKWRR V 
SLTQRTDLYRKQHAQEAIVWQQP PPTS PEPPEDNDDRQTGAELR 
PPPPEEDNPEWRPHNGRVGGSPVRHLYSKRLDRSVSYQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETAEP 
GLPGDTVTPQPDCGFRAGGDPPLLKLTAPAVEAPVERRPCCLLM 


5568 


1731 


587 


AEDRQPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
SLLVSGPRLFLLQQPLAPSGLTLKSEALRNWQVYRLVTYI FVYE 
NP I SLLCG AI 1 1 WRFAGNFERTVGTVRHC FFTVI F AI FS AI I FL 
S FEAVSSLS KLGEVEDARGFTPVAFAMLGVTTVRSRMRRALVFG 
MWPSVLVPWLLLGASWLIPQTS FLSNVCGLS IGLAYGLTYC YS 
I DLSER VAL KLDQT F P FS LMRR I S VFKYVSGS S AERRAAQSRKL 
NPVPGS YPTQS CHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNS PGIVY S GALGT PGAAGSKES SR VPMP 


5569


2


835


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG
LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPG I PAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG 
PMGIPGEPGEEGRYKQKFQSVFTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVIiLYRSGVK 
WTFCGHTSKTNQWSGGVIJjRLQVGEEVWLAVND YYDMVG IQG 
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 
P YEGGVWKVRVDLPDKYP FKS PS IGFMNKI FHPN IDEASGTVCL 
DVINQTWTALYDriTNIFESFLPQLLAYPNPIDPIiNGDAAAMYLH 
RPEB Y KQ KI KB Y I QKYATEEALKEQ E EGTGDS S SESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MS S PS PGKRRMDTDWKLI ESKHEVTI LGGLNEFWKFYGPQGT 
PYEGGWKVRVDLPDKYP FKS PS IGFMNKI FHPNIDEASGTVCL 
DVINQTWTALYDLTNIFESFLPQLIiAYPNPIDPLNGDAAAMYIiH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLLGAYRLWV 
RWGRRGLGAGAGAGEESPATSLPRMKKRDFSI»EQI*RQYDGSRNP 
RI LLAVNGKVFDVTKGS KFYGPAGP YGI FAGRDASRGLATFCLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLIiKPGE 
EPSEYTDEEDTKDHNKQD 


5573


2562


219


VPART PNAEDQG P EARAATAT PCQSGGRERAGEAAEDGVKMAAF 
SEMGVMPE IAQAVEEMDWLLPTDIQAES IPLILGGGDVLMAAET 
GSGKTGAFS I P VI Q I VYETLKDQQEGKKGKTT I KTGASVLNKWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGLMKGKHYYE 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNKQFD 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDIjGLAFEIPPHMKN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK
SQHSGNAQVTQTKFLPNAPKALIVEPSRELABQTLNNIKQFKKY 
IDNPKLRELLI IGGVAARDQLSVLENGVDI WGTPGRIiDDLVST 
GKLNtiSQVRFLVLDEADGLLSQGYSDFINRMHNQIPQVTSDGKR 
LQVIVCSATLHSFDVKKLSEKIMHFPTWVDLKGEDSVPDTVHHV 
VVPVNPKTDRLWERLGKSHIRTBDVHAOT 

IKILKGEYAVRA1KEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
RGIDIHGVPYVINVTLPDEKQNYVHRIGRVGRAERMGLAISLVA 
TEKEKVWYHVCSSRGKGCYNTRIiKEDGGCTIWYNEMQLLSEIEE 
HLNCTISQVEPDIKVPVDBFDGKVTYGQKRAAGGGSYKGHVDIli 
APTVQEIiAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQPEDKGAVPEDASTERSAMASLGLQLVGYILG 
LLGLLGTLVAMLLPS WKTSSYVGAS IVTAVGFSKGLWMECATHS 
TGITQCDIYSTLLGIiPADIQAAQAMMVTSSAISSLACIISWGM 
RCTVFCQE S RAKDRVAVAGGVFF I LGGLLGF I PVAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGI ISSLFSLIAGI ILCFSCSCQRN 
RSNYYDAYQAQPLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LLWALPCPPPTAAAVLI^STGLMELIjEKMIiALTIuAKADSPRTAIi 
LCSAWLLTASFSAQQHKGSLQKDPLLSQACVGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFVLFLFLLHRDVSSR 
EEATEKPWLKSLVSRKDHVLDLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
S KWTPLETQEKEEGYKKHCFNAFASDRI SLQRS LGPDTRPPECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
E I ILVDDASTEEHLKEKLBQYVKQLQWRWRQEERKGLITARIi 
LGASVAQAEVLT FLDAHCE CFHGWLEPLLARI ABDKTWVS PD I 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 
KDETYP I KS PTFAGGLFS I SKS YFEHI GTYDNQME I WGGENVEM
SFRWQCGGQLEI I PCS WGHVFRTKSPHTFPKGTSVIARNQVR 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSPGDISERLQLREQLHC 
HNFSWYLHNVYPEMFVPDLTPTF YGAI KNLGTNQCLDVGENNRG 
GKPLIMYSCHGIX3GNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCHFTGKNSQVPKDEEWEIiAQDQLIRNSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV* 


5577


3


1275


RNS DCS CGE I S VHCLP W VLF I LDLKVE S SMF CP LKL I LLP VLLD 
YSLGLNDLNVSPPELTVHVGDSALMGCVFQSTEDKCI FKIDWTL 
SPGEHAKDEYVLYYYSNLSVPIGRFQNRVHLMGDILCNDGSLLL 
QDVQEADQGTYICB IRLKGESQ VFKKA WLHVLPEEPKELMVHV 
GGLIQMGCVFQSTEVKHVTKVEW I FSGRRAKEEI VFRYYHKLRM 
S VE YSQS WGHFQNRVNLVGD I FRNDGS I MLQGVRESDGGNYTCS 
IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVLGGNQLVI IV 
GIVCATILLLPVLILIVKKTCGNKSSVNSTVLVKNTKKTNPEIK 
EKPCHFERCEGEKHIYSPIIVREVIEEBEPSEKSEATYMTMHPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDFSSFRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNV 
TS VD YSS VWAAMQARYAHVPQLRWETMD VRKLDFP S AS FDWL ! 
EKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVSRVLVPGGRFI 
SMTS AAPH FRTRH YAQAYYGW S LRHATYGS GFHFHLYLMHKGGK 
LS VAQLALGAQ I LSPPRP PTS PCFLQDSDHEDFLS AI QL 


5579 


3 


1540 


RNSGLARGASALARHGGGLAGGVGWDCGACASRCQGVMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLLSRVFQPQNL 
REDRVLS LQDKSDDLTCKSQRLMLQVGLI YPASPGCYHLLPYTV 
RAMEKLVRVIDQEMQAIGGQKVNMPSLSPAELWQATNRWDLMGK 
ELLRLRDRHGKEYCLGPTHEEAITALIASQKKLSYKQLPFLLYQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
DAYCSLFNKLGLPFVKVQADVGTIGGTVSHEFQLPVDIGEDRLA 
ICPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTFYLGT 
KYS S I FNAQFTNVCG KP TLAEMG C YGLGVTRI LAAA I EVLS TED 
CVRWPSLLAPYQACLIPPKKGSKEQAASELIGQLYDHITEAVPQ 
LHGE VLLDDRTHLT IGNRLKDANKFG YP FV 1 1 AGKRALEDPAHF 
EVWCQNTGEVAFLTKDGVMDLLTPVQTV 


5580


1681


450 


ADAGTRC I PGF WPSGAG YS APAQRGRRSSGRMRAAAAPGLTAP 
vWRLI^QCCELEAGEEGMAVPAAAMGFSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFLAHSAKVHSVAWSCDGRRLASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKT IR I WDVRTTKC IATVNTKGENINI CWS PDGQT IAVGNK 
DD WTF IDAKTHRSKAEEQF KFEVNE I S WNNDNNMFFLTNGNGC 
INILS YPELKP VQS INAHPSNCI CI KFDPMGKYFATGSADALVS 
LWDVDELVCVRCFSRLDWPVRTLSFSHDGKMLASASEDHFIDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLIAFACDDKDGKYDSSR 
EAGTVKLFGLPWDS 


5581 


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNSPSYAPEFQFLHSAYATLLMKQAWPQNSSSCGTEG 
TFHLPVDTGTENRT YQAS SAAFRYTAGTPYKVP PTQSNTAPP PY 
SPSPWPYQTAMYPIRSAYPQQNLYAQGAYYTQPVYAAQPHVIHH 
TTWQPNS IPSAI YPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5775 


2739 


I ITNNNNVI I PLVI AYHLSGSAQARGERS PAERLMERQKRKAD I 
EKGLQF I QSTLPLKQEEYEAFLLKLVQNLFAEGNDLFRE KDYKQ 
ALVQYMEGLNVADYAASDQVALPRELLCKLHVWRAACYFTMGLY 
EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGS IDD I ETDCYVDPRGSPALLP STPTMPLF 
PHVLDLIAPLDS S RTL PSTDS LDDFSDGDVFG PELDTLLDSLS L 
VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL
DSFGSTRGSLDKPDSFMEETNSQDHRPPSGAQKPAPSPEPCMPN

TALLIKNPLAATHEFKQACX3LCYPKTGPRAGDYTYREGLEHKCK 

RDILLGRLRSSEDQTWKRIRPRPTKTSFVGSYYLCKDMINKQDC 

KYGDNCTFAYHQEEIDVWTEERKGTLNRDLLFDPLGGVKRGSLT 

I AKLLKEHQGI FTPLCE I CFDSKPRI ISKGTKDSPS VCSNLAAK 

HSFYNNKCLVHIVRSTSLKYSKIRQFQBHFQFDVCRHEVRYGCL 

REDSCHFAHSFIELKTOLLQQYSGMTHEDIVQESKKYWQQMEAH 

AGKASSSMGAPRTHGPSTFDLQMKFVCX3QCWRNGQWEPDKDLK 

YCSAKARHCWTKERRVLLVMSKAKRKWVSVRPLPS IRNFPQQYD 

LCIHAQNGRKCQ YVGNCS FAHSPEERDMWTFMKENKI LDMQQT Y 

DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 

OGKNSNSKKQWQQHIQSEKHKEKVFTSDSDASGWAFRFPMGEFR 

LCDRLQKGKACPDGDKCRCAHGQEEIiNEWLDRREVIiKQKLAKAR 

KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
I KKAYRKLALKYHPDKNPDEGE KFKL I SQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQLSVTLEDL YNGVTKKLALQKNV I CEKCEGVGGKKGSVE KCPL 
CKGRGMHIHI QQIGPGMVQQIQTVCI ECKGQGERINPKDRCESC 
SGAKVI REKKI I EVHVEKGMKDGQKI liFHGEGDQEPELEPGDVI 
IVIiDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRl 
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKHWLSLBKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYE EDEDGPQAGVQCQTA 


5584 


3 


1265


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
I KKAYRKLALKYHPDKNPDEGEKFKLI SQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
mbSmLEDLYmVTKKLXLQKWICEKCEGVGGKKGSVEKCPh 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKX I EVHVEKGMKDGQKI LFHGEGDQEPELEPGDVT 
I^DQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVI TS KAGEVI KHGDLRCVRDEGMP I YKAPLEKG I LI IQFLVI F 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585


2619


915


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS
SVTDSFSSLVNRPTLGQFTSBEIlU^VCYAKCLLQmALTFLQD
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHS FRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CTWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I SANE KKI KYDHYLI PNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5586 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYAT ILEMQAMMTFDPQD I LLAGNMMKEAQMLCQRHRRKS 
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLE FVG FSGNKD YGLLQLEEGAS 
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKI KYDHYLI PNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5587 


1768 


148 


SSAVPDGAVGRP VAVAVGGPPHS CRCRPCCLMAAI GVHLGCTSA
CVAVY KDGRAG WANDAGDRVTPAWAYSENBE I VGLAAKQS RI 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIEKNGKLRYE
IDTGEETKFVNPEDVARLIFSKMKETAHSVLGSDANDWITVPF 
DPGEKQKNALGEAARAAGFNVIiRL I HE PSAALLAYGIGQDS PTG 
KSNI LVFKLGGTSIiSLS VMEVNSG I YRVLSTNTDDNI GGAHFTE 
TfjAQYb&S HFQk S FKHDVRGNAKAMnivij TJN^A-Ci VAJxnbbb 1 IAjo | 
ANCFLDSLYEGQDFDOS^SRARFEIjLCSPLFNKCIEAIRGLjLDQ 
NGFTADD INKWLCGGS SR I P KLQQIi I KDLFPAVELIjNS I P PDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDILVKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
EETKFAQVVIjQDLD Kj^MjIiKJJ J.I1B.V ±j l MJ\K1ajo Jjxi v 1 1 uya 1 
GKCEAISIEIAS 


5588 


3 


589 


TPPPPEQAMVAATVAAAWLLLWAAACAQQEQDFYDFKAVNIRGK 
LVS LE KYRGS VS L WNVAS E CG FTDQH YRALQQLQRDIX3 PHH FN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAF KYLAQT SG KEPTWN F W KYLVAP DG KWGAW D P TVS VE EV 
RPQITALVRKLILLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRSPFSSFLPRPICLSLEARPCSIEDRRNWSLIGRPGAPAS 
GLNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGK/^LCALLIi 
ATIjGAAGQPLGGE S ICSARAPAKYS I TFTGKWSQTAFPKQYPLF 

nnnn^f.TnnT t y-iTt 7\ xtc f>7~>\sc i\zr»7T5 VK7/TVT T O X77~* T DnUB CD/t!I7 , &Gl72V T.M 

RPPAQWSSIjIjGAAHSSDx £jM WKixJVU * v is WijUKi^r^uiKtjja^WAAJirj 
KEIEAAGEALQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVSF 
WRIVPSPDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGF 
TFSSPNFATIPQDTVTEITSSSPSHPANSFYYPRLKALPPIARV 
TLLRLRQSPRAFIPPAPVLPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEE71ECVP 

DNCV 


5590 


72 


896 


LCSSGALRljLPAMVAWRSAFljVLuAr fa JjAI ljvyKij£>ijt)r uuriM jj 
EDAVKETSSVKQPWDHTTTTTTNRPGTTRAPAKPPGSGIiDLADA 
LDDQDDGRRKPG1GGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
LADALDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYGSNDDPG SGMVAE PGT I AGVAS ALAMAL IGAVS S Y I S Y 
QQKKFCFS IQQGIiNADYVKGENLEAWCEEPQVKYSTLHTQSAE 
PPPPPEPARI 


5591


68


1494


AGSSRRAAAERIiLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA 
LRVTRNS KINAENKAKINMAGAKRVPTAPAATSKPGLRPRTALG 
D IGNKVS EQLQAKMPMKKEAKPSATGKVI DKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPIliVDTASPSPMETSG 

A*n*nnnT /tA«noTMrTT A T 7VT7-M TTt 7\ crVTITi Tl DXTr.r^C WV^/Vn T VS VTi 
t^fy PfitfKTjTtCQAFSuVlLAVNDVU I V AJUX XriX U 

RQI1EEEQAVRPKYI1LGREVTGNMRAIL I DWIiVQVQNKFRLLQET 
MYMTVSI IDRFMQNNCVPKKMLQLVGVTAMFIASKYEEMYPPEI 
GDFAFVTDNTYTKHQIRQMEMKILRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTl^KYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
ILDNGEWTPTLQHYLSYTEESLLPVMQHLAKWAAMVNQGLTKHM 

rm r wr W 7\ I* O VtI7\ VTO T»T OAT MOAT AJOTYT . A Tf A VJVKV 


5592 


242 


924 


YGESKDWNQKDLLSALVIjTTVNCLPTPIMAKSAEVKLAIFGRAG 
VGKSALVVRFLTKRFI WEYDPTLESTYRHQATI DDEVVSME ILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKXPKNVTLILVGNKADLDHSRQVSTEEGEKIiATELACAFYECS 
ACTGEGNITEIFYELCREVRRRRMVQGKTRRRSSTTHVKQAINK
MLTKISS 


5593 


3 


1113 


" HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
i ccr'T^n^nT.^Fn^T.Tfn^VTTRnFRDPF.EEHELPVl^METINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCTjRQNLIKCIENLEELQSLR 
ELDLYDNQI KKIENLEALTELE I LDI SFNLLRNI EGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV
MLALPSVRQIDATFVRF 


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEBDPEEEHELPVDMETINLDRDAE 
U V U Lain i K avj K.1 iiVj r B V Jjixtw J\ IIjUjRUNIjIIvCII^uBBIjQoIjK 
ELDLYDNQ IKKI ENLEALTELE ILDIS FNLLRN I EGVDKLTRL.K 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAXENIDTLTNLE 
S ZjFLGKNK I TKLQNIJDALTNLTVLS MQSNRLTKI EGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIEMISHLTELQ 
EFWMNDNIxLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPS VRQ I DATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGE^QRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LG I PTVPGKVTLQKDAQNL IGIS IGGGAQ YCPCLYI VQVFDNTP 
AALDGTVAAGDE I TGVNGRS I KGKTKVEVAKM I QEVKGEVT I HY 
NKLQADP KQGMS LDIVLKKVKHRLVENMSSGTADALGLSRAI LC 

KrnnT.VTTDT.T7'CT.PO r PIVT7T VV^WTPUT'TrXTTT OBPV^T.CnTUDIV T?nT^» 
viU<jLiv jvKijoiixg t. K l/\iixj I rxtoyi 1 ati X -R-N I >1ji</Vc I ciliay IrLtCftx uU 

VFS VI GVRE PQP AASEAFVKFADAHRS I EKFG I RLLKT I KPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFEYLS YCLKVKEMDDEEYSC 
IALGEPLYRVSTGNYEYRLILRCRQEARARFSQMRK17VLEKMEL 
ljjjUavci vyuj. v r vjj-j\jr\xj v o x noi\i iviu\,in vxji\x/rixj v r ricivu 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596


698 


219 


gavlapsslpaaelaaqgesqsledlsntsrptsevykisfifp 
ngdkydgdctr!tssgiyerngigihttpngivytgswkddkmng 
fgrlehfsgavyegqfkdnmfhglgtytfpngakytgnfnenrv 

x\.or»\an» x l nx\^\3 x i\fnuv v x rxir icLou x 


5597 


3 


731 


ISCK^tAAIX3QSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VFVIYGFVTLIIFKMLHTISFLGGLAI^EGVlWLiraWIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYLRMHQTNNA 
RFLDLLWRHVLSI^lxLAVAFLVSYSRVYLIjYHTWSQVLYGGIAG 
GIJ^IAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAIi 
VPLMSGVPPHPPAPSPCCSGQTMLKMLSL^LLIiAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLS CCLRSDS PGLGRLBNKI FS VTONTECGKLLEE 
I KCALCS PHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHI PGFLQTTADE FCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGIiRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYIjDIHKLVQSGIKGGDERGLL 
SIxAFHPOTKKNGKLWSYTra 

WDwrvwnT ,p tupuitt . wart.vp wtTjmm .t . TrrcDnrcpT .v t t t /^tvzm 

XM rnW VUXirt ±.t\£\. v r i ir,uMMiHK i\mi a -a. iiiii<uKiAir Xj I X. XXjoXAjIu 

ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQIlKGKDYESEPSLLEFKPFSNGPIiVGGFVYRGCQSERL 
YGS YVFGDRNGNFLTLQQS PVTKQWQBKPLCLGTSGSCRGYFSG 
HIl^FGEDELGEVYILSSSKSMTQTHNGKLYIQVDPi^PIiMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPAWSSWSCAL 
VPLLGSGVPPHPPAPS PCCSGQTMLKMIiS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNP PKRLKRRDRRMMSQLELLSGG 
EMLCX3GFYPRx^CCLRSDSPGU5RLroKIFSVTNNTECGKLLEE 
I KCALCS PHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GH I PRFT.OTTADE FCFYYARKDRGrjTFPDFPRKOVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYxjDIHKLVQSGIKGGDERGLIj 
SlxAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFIiYI I LGDGM 
ITLDDMEEMDGx^DFTGSVLRxxDVDTOMONVPYSIPRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERIi 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


S LRVLSGHLMQTRDLVQPDKPAS PKF I VTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELS VAQKPEKLLERCKYWPACKNGDE CAYHHP I S P CKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5601 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPAS PKF I VTLDGVPS PPG YMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQ KPEKLLERCKYWPACKNGDE CAYHHP IS PCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5602 


246 


766 


YHTS CTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAEMVA I DQVLDW CRQSGKS P S E VFEHTVL YVTVEPC I M C 
AAALRLMKIPLVVYGCQNERFGGCGSVIiNIASADLPNTGRPFQC 
I PG YRAEEAVEMLKTFYKQENPNAPKS KVRKKE CQQI LNMF 


5603 


1 


565 


FRGRT P I S GGERG CAQ Y P I P ATPARSGENRTM PGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CFGFEDLHFRWTYNSSDAFKIIjIEGTVKNEKSDPKVTLKDDDRI 
TLVGSTKEKRNNISIVLRDLEFSDTGKYTCHVKNPKENNLQHHA 
TIFLQWDRRMQ 


5604 


1 


1506 


EDI F PAQLLKLQRHERVWQQE P PVRDHRS WGGSGAGGVAGREWT 
DCKK3VJ\LGGHYMAEGEGYFAMSEDELACSPYIPLGGDFGGGDFG 
GGDFGGGDFGGGDFGGGGS FGGHCLDYCESPTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKVVRRRLAEKRIGVR 
DVRLNGSAASHVLHQDSGLGYKDLDLI FCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPTI IGESVYGDFQEAFDHLCNKI I ATRNPEE IRGGG 
LLKYCNLLVRGFRPASDEIKTLQRYMCSRFFIDFSDIGEQQRKL 
ESYLQimFVGLEDRKYEYLMTLHGVVNESTVGLMGHERRQTLNL 
ITMLAIRVLADQNVI PNVANVTCYYQPAP YVADANFSNYYIAQV 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 

MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 

KALRSLRRYPLPLRSGKEAKILQHFGDGIjCRMLDERLQRHRTSG 

GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP 

ARHSGARVILLVLYREHLNPNGHHFLTKEELLQRCAQKSPRVAP 

GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 

GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 

GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 

VWVAQETNPRDPANPGELVLDHIVERKRLDDLCSSIIDGRFREQ 

KFRLKRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 

FFVKRTAD I KESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE \ 

SGAMTSPNPLCSLLTFSDFNAGAI KNKAQS VREVFARQLMQVRG 

VSGEKAAALVDRYSTPAS LLAAYDACATPKEQETLLSTI KCGRL 

QRNLGPALSRTLSQLYCSYGPLT 


5606 


3 


1099 


GRS RC PGPGARGGTMS PRS CLRS LRLLVFAVFS AAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQRQVQMCKRNLEVMDSVRRGAQLA 
IEECQYQFRNRRWNCSTLDSLPVFGKWTQGTREAAFVYAISSA 
GVAFAVTRACS SGELEKCGCDRTVHGVS PQGFQWSGCSDN I AYG 
VAFSQS FVDVRERSKGAS S S RALMNLHNNEAGRKAI LTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFRQVGHALKEKFDGATEVEPRRVG 
SS RALVPRNAQFKPHTDEDLVYLEPS PDFCEQDMRSGVLGTRGR 
TCNKTS KAIDGCELLCCGRGFHTAQVELAERCS CKFHWCCFVKC 
RQCQRLVELHTCR


5607 


521 


141 


PP VCNPAEAMPS PGTVCSLLLLGMLWLDIaAMAGSS PLS PEHQRV 
QQRKESKKPPAKLQPRALAGWLRPEDGGQAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R I QTEPKYTG I WHCVRDTYHRBRVWGFYRGI.LL PVCTV<?I,V<; <5 P 

VFGTYRHCLAHICRLRFGNPDAKPTKADITLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQ KQQRRLSASGPLAVP PMCP VP PACPB P 
KYRGPLHCLATVARE EGLCGLYKGSSALVLRDGHS FATYFLSYA 
VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVIKSRL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KRIREAKRSARPELKDSLDWTRHNYYESFSLSPAAVADNVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDITOGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EH P KRRKLLED YKVP KF FTDDLFO V AGEKRR P P YRW PVMR P PR Q 

GTGIHIDPLGTSAP^NALVQGHKRWCLFPTSTPRELIKVTRDEGG 
NQQDEAITWFNVIYPRTQLPTWPPEFKPLEILQKPGETVFVPGG 
WWHWLNLDTT I AI TQNFAS STNF P WWHKTVRGRPKLSRKWYR 
ILKQEHPELAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGS INTLSAKWADNFMAEG 

CGGSKEHSFQHPFLC2AVGMFLGEFSC1jAAFYLLRCRAAGQSDSS 
VDPOOP FNPLL FLP PA!L CDMTGTS "LM WAT iNMT <3 TV Q cj PAMT .P a, & 

VI I FTGLFS VAFLGRRLVLSQ WLG I LATI AGLWVGLADLLS KH 
DSQHKLSEVITGDLLIIMAQIIVAIQMVLEEKFVYKHNVHPLRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDALDAFCQ 
VGQQPL I AVALLGNISS I AFFNFAGI S VTKELSATTRMVLDSLR 
TWIWALSLALGWEAFHALOILGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEESEQERLLGGTRTPINDAS 


5611 


2 . 1 


577 


FVLPNRI/3IPGSTFRGPGACASSSSI«AASAKPGAGGSPALiAMSG 
ELSNRFQGGKAFGLLKARQERRIAEINREFLCDQKYSDEENLPE 
KLTAFKEKYMB FDLNNEGE I DLMSLKRMMEKIXjVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVLKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEENHIFELMQAMWLCKHLNS 
SLLTLENLILNEFSYTATEARRLYliQRKTVPSALLVQLIQERLA 
EEDCIKQGWILDGIPETREQALRIQTLGITPRHVIVLSAPDTVL 
IERNLGKRIDPQTGEIYHTTFDWPPBSEIQNRLMVPEDISELET 
AQKLLE YHRNI VRVI PSYPKI LKVISADQPCVDVFYQALTYVQS 
NHRTNAP FTPRVLLLG PVGS 


5613 


115 


1279 


RG VD PALRRAE KML P LS I KD DE YKPP KFNLFGKI SGW FRS I LS D 
KTSRNLFFFLCLNLSFAFVELLYGIWSNCLGLISDSFHMFFDST 
AI LAGLAAS VI S KWRDNDAFS YG YVRAEVLAGFVNGLFLTFTAF 

FIFSEGVERALAPPDVHHERLLLVSILGFVVNLIGIFVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHS HEVKHGAAHSHDH 
AHGHGHFHSHIX3PSLKETTGPSRQILQGVFLHILADTLGSIGVI 
ASAIMMQNFGLMIADPICS ILIAI LIWS VI PLLRESVGILMQR 
TPPLLENSLPQCYQRVQQLQGVYSLQEQHFWTLCSDVYVGTLKL 
IVAPDADARWILSQTHNIFTQAGVRQLYVQIDFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
AP WGARQRU3 VMAELQQLQE FE I PTGREALRGNHS ALLRVAD YC 
EDNYVQATDKRKALEETMAFTTQALASVAYQVGNLAGHTLRMLD 

LQGAALRQVEARVSTLGQM\^™EKVARREIGTLATVQRLPPG
QKVIAPENLPPLTPYCRRPLNFGCLDDIGHGIKDLSTQLSRTGT
LSRKSIKAPATPASATLGRPPRIPEPVHLPWPDGRLSAASSAS
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF
GPDEPSWVPASYLEKVVTLYPYTSQKDNELSFSEGTVICVTRRY
SDGWCEGVSSEGTGFFPGNYVEPSC


5615 


9 


1558 


AliGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGISFVQTIMILLKGNIGTG 
LLGLPLAIKNAGIVLGPISLVFIGIISVHCMHILVRCSHFLCLR 
FKKSTLG YSDTVS FAMEVS PWS CLQKQAAWGRS WDFFIjVITQL 
nprQWTVFLAENWOVHEGFLESKVFISNSTNSSNPCERRSVD 
LRIYMLCFLPFIILLVFIRELKNLFVLSFLANVSMAVSLVIIYQ 
YWRNMPDPHNLP I VAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKESKRFPQALNIGMGIVTTIiYVTLATLGYMCFHDEIKGSITLN 
LPQDVWLYQS VKI LYS FGI FVTYS IQFYVPAE 1 1 1 PG I TSKFHT 
KWKQICEFGIRS FLVS I T CAGAI L I PRU) I VI S FVGAVS S STLA 
L I LP PLVE I LT FS KEH YNI WMVLKN I S IAFTG WGFLLGTY I TV 
EEIIYPTPKWAGTPQSPFLNLNSTCLTSGZjK 


5616


1 


719 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDLLRDNMLRGTEIGVLAKAFIDQGKLIPDDVMTRLAL 
HELKNLTQYSWLLDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
TKORLTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


P WRGRGSRPRGAGAMAEE QVNRSAGIiAPD CEASATAE TTVS SVG 
TCEAAGKSPEPKDYDSTCVFCRIAGRQDPGTELIiHCENEDLICF 
KDI KPAATHHYLVVP KKH IGNCRTLRKDQVELVENMVTVGKTIL 
ERNNFTDFTNVRMGFHMPPFCS I SHLHLHVLAPVDQLGFLS KLV 
YRVNS YWFITADHLI EKLRT 


5618 


3 


1692 


YLI^INLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKS I RLLSEI EKLVGTSVPGLLE I ILSSS I LE I YN 
HI LQTWPDEDVTFRKS CATKRKLSN INQEEASGTSLHQKAIMT 
FTCHNEINAFWLSRGSQILSLNSTRFLTKIjGHCSSACPSDSVS 
QTNIQNIiKGLNSPVLIGKSKDPSCVAKVSEEGKPAISTQKMELH 

\ro djp <s Tvpn v rvDJi. <5 pt ,wt ptftjks s ttvyi gshshrmkavd FY 

SGKVKWEQILGDRIESSACVSKCGNFIWGCYNGLVYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 
VWKSKCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVI W 
KHSCGKPLFSSPQCCSQYICIGCVlXSNLliCFTHFGEQVWQFSTS 
GP I FSSPCTSPSEQKI FFGSHDCFI YCCNMKGHLQWKFETTSRV 
VATPFAFHNYNGS NEMLLAAAS TDGKVW I LE SOSGOLQSVYEIiP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DS PVLPTSGNVISTAQPAQ PWSAVEAALRSLGS PPGAGRGCP CP 
AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCX3MSCMYSFLGHCSVL 
LWGTKGRGSGS PS S PGCCLHP PAQHSQDLPLVHVDVGWQPPLGP 
WGLRPGLI03ERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST

AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 

L I ADAKTL IDKARVETQNHWFTYNETMTVESVTQAVSNLALQFG 

EEDADPGAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFVQCDAR 

AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLWA 

TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


WEFVE YTATDANVKKESLSS VQQLG I KMTVRYGKFLSLLKDGA 
ENDLTWVLKHCERFLKQQQTSIKSSLLCLQGNYAGHDWFVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
G IHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQ I CLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQVFLKEEALHGFRVSDYFEYMEILEQNYRTVLLRDMR 

NIRLQST 


5622 


1122 


456


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRNPMKAMY
PGTFYFQFKNLWEANDRNETWLCFTVEGIKRRSWSWKTGVFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLT I FTARL YYFQYPC YQEGLRSLSQEGVAV 
E IMD YED F KYCWENF VYNDNEP F KPWKGLKTN FRLLKRR LRES L 

Q


5623 


3 




FLPFFIRAPKISRNGQWLFTFTTPFPFANKALPGWBGIVPACFW 
RKKILTPSTGTMELLQVTILFLLPSICSSNSTGVLEAANNSLVV 
TTTKPS ITTPNTESLQKNVVTPTTGTTPKGTITNELLKMSLMST 
ATFLTS KDBGLKATTTDVRKNDS 1 1 SNVTVTS VTLPNAVSTLQS 
SKPKTETQSSIKTTEIPGSVLQPDASPSKTGTLTSIPVTIPENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN


5624 


159 


898


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SSGSRKLYFDTHALVCLLEDNGFATQQAEIIVSALVKILEANMD 
IVYKDMVTKMQQEITFQQVMSQIANVKKDMIILEKSEFSALRAE 
NEKIKLELHQLKQQVMDEVIKVRTDTKLDFNLEKSRVKELYSLN 
EKKLLE LRTE I VALHAQQDRALTQTDR KI ETEVAGL KTMLE S HK 
LDNIKYLAGSIFTCLTVALGFYRLWI


5625 




1180


TI P SSAAAQRAGP PAGALEALS PGGARAHAERRGEMRATPLAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTEYTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHS LVRSRHR I PEPEAAVLFRQMATALAHCHQHG LVLRDL KL 
CKFVFADRERKKLVLENLEDSCVLTGPDDSLWDKHACPAYVGPE 
ILSSRAS YSGKAADVWSLGVALFTMLAGHYPFQDSE PVLLFGKI 
RRGAYALPAGLSAPARCLVRCLLRREPAERLTATGILLHPWLRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDRBWLYG 


5626


3123 


2011 


PPRAI/3SVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TENVLHFKAQGHGAKGDNVYE FHLEFLDLVKPE PVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLES EG S P ETLTNLRKG YL FM YNLVQ FLGFS W I FVN 
LTVRFC I LGKES FYDTFHTVADMMYFCQMLAVVET I NAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTCIDMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 
S I PI FNETGRFS FTLP YPVKI KVRFSFFLQ I YL IM I FLGLY INF 
RHLYKQRRRRYGQ KKKKIH 


5627 


3123 


2011


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAIS I 
TENVLHFKAQGHGAKGDNVYE FHLEFLDLVKPE PVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGS PETLTNLR KGYLFMYNLVQFLGFSWI FVN 
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYS F YMLTCI DMDWKVLT WLR YTLW I PLYPLGCLAEAVSVIQ 
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQ KKKKIH 


5628 


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 

SLS PVARS FSACSVGLGRSS YRATSCLPALCLPAGGFATS YSGG 

GGWFGEGILTGNEKETMQSLNDRLAGYLEKVRQLEQENASLESR 

IREWCEQQVPYMCPDYQSYFRTIEELQKKTLCSKAENARLWEI 

DNAKLAADDFRTKYETEVSLRQLVESDINGLRRILDDLTLCKSD 

LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 

DLNRVLEEMRCQYETLVENNRRDAEDWLDTQSEELNQQWSSSE 

QLQSCQAEI IELRRTVNALEIELQAQHSMRDALESTLAETEARY 

SSQLAQMQCMITNVEAQLAEIRADLERQNQEYQVLLDVRARLEC

E INTYRGLLES EDSKLP CNPCAPD YS PS KS CLPCLPAASCGPSA 

ARTNCSARPICVPCPGGRF 


5629 


2287


938 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAASRRSPAARPPV 
PAP P ALPRGRPGTEGS TS LS AP AVL WAVA WWWSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDGITNKLIGCYVGNTMEDWLVRIYGNKTELLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAI FRLIARQLAKIHAIHAHNGWI P KSNL WLKMGK YFS L I PTGF 
ADED I NKRPLSD I PS SQ I LQEEMTWMKEI LSNLGSPWLCHNDL 
LCKNI IYNEKQGDVQPIDYEYSGYNYLAYDIGNHFNEFAGVSDV 
DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 
NQFALASHFFWGLWALIQAKYSTI E FDFLGYAI VRFNQYFKMKP 
EVTALKVPE 


5630 


1194 


278 


GFWAIAQTCAHHIiPPGS P WLVPAS PWRLPEMS S FGYRTLTVALF 
TLICCPGSDEKVFEVHTOPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
NSNVSVYQPPRQVILTLQPTLVAVGKSFTIECRVPTVEPIiDSLT 
LFLFRGNETLHYETFGKAAPAPQBATATFNSTADREDGHRNFSC 
LAVLDLMSRGGNIFHKHSAPKMLEIYEPVSDSQMVI IVTWSVIj 
LSLFVTS VLLCF I FGQHLRQQRMGTYGVRAAWRRliPQAFRP 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAALQALKRKKRYEKQLAQIDGTLSTIEFQREALENANTN 
TEVLK^GYAAKAMKAAHDNMDIDKVDEtJ^QDIADQQELAEEIS 
TAI SKPVGFGEE FDEDELMAELEELEQEELDKNLLE ISGPETVP 
LPNVPSIALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWS P PRRLWWGS LG AAQRPAVP VSGLARS LHVE TRRPHRRA 
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGP P PAYAPTNGD 
FTFVSSADAEDLSGSIASPDVKLNLGGDFIKESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF 
NRQWRDNPDFWGPLAWLFFSMISLYGQFRWSWI ITIWIFGS 
LTI FLLARVLGGEVAYGQVLGVIGYSLLPLIVIAPVLLVVGSFE 
WSTLIKLFGOTWAAYSAASLLVGEEFKTKKPLLIYPIFLLYIY 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLiLSSHLHQVPFFCCFTWCLCN 
CLFENSVSKLYMLCFNFFMSIFFYSLSITKLNLIYLWGLSYQSL 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRPRAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLSSRFRRVDIDEFDENKFVDEQEEAAAAAAEPGPDPSEVD 
GLLRQGDMLRAFHAALRNSPVNTKNQAVKERAQGVVLKVLTNFK 
SSEIEQAVQSLDRNGVDIiLMKYIYKGFEKPTENSSAVIjIjQWHEK 
ALAVGGLGS I IRVLTARKTV 


5635 


3 


943 


DRGPRSTATDTGRARVSFWRFPLDPGVKNSNVQISGEKRRFRTL 
RSLFHPFPVTRSGAPRAVLVGSSWPAKMVAPAVKVARGWSGLAL 
GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
EKYTOELKKTQLIKAAPAGKTSSVFEDPVISKFTNMMMIGGNKV 
LARSLMIQTLEAVKRKQFEKYHAASAEEQATIERNPYTI FHQAL 
KNCEPM IGLVP ILKGGRF YQVPVPLPDRRRRFLAMKWMITECRD 
KKHQRTLMPEKLSHKLLEAFHNQGPVIKRKHDLHKMAEANRALA 

HYRWW 


5636 


2253 


1143 


LEDTICQHPPAEKKLYLYHRKLREVERNGIPRLPKDVFMDTHQG 
LTDVRAKVTGFSEGWDSVKGGFSS FSQATHSAAGAWS KPRE I 
ASLIRNKFGSADNI PNLKDSLEEGQVDDAGKALGVI SNFQSS PK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDIiTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRISKMELQQQQQQWQLEGLENATARNIiLGKLINI 
LLAVMAVLLVFVSTVANCVVPLMKTRNRTFSTLFLVVF I AFLWK 
H WDALFS YVERFFSS PR 


5637 


948 


2532 


M S F CGARANAKMMAAYNGGTS AAAAGHHHHHHHHLPHIiP P PHLH 

HHHHPQHHLHPGSAAAVHPVQQHTSSAAAAAAAAAAAAAMLNPG 
QQQPYFPS PAPGQAPGPAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQQLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDI LQPPHI DYFEE I YWTE 
LMQSDLHKI I VS PQPLS SDHVKVFLYQI LRGLKYLHSAGI LHRD 
IKPGNLLVNSNCVLKICDFGLARVEELDESRHMTQEWTQYYRA 
PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQLDL 
ITDLLGTPSLEAMRTACEGAKAHILRGPHKQPSLPVLYTLSSQA 
THEAVHLLCRMLVFDPYKRISAKDALAHPYLDEGRLRYHTCMCK 
CCFSTSTGRVYTSDFEPVTNPKFDDTFEKNLSSVRQVKEIIHQF 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKI YAMHWGTDSRIjLVS ASQDGKL 1 1 WDSY 
TTNKVHAI PLRS S WVMTCAYAPSGNYVACGGLDNICS I YNLKTR 
EGNVRVSRELAGOTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNI I CGITSVS FS KSGRLLLAG YDDFNCNVWDALKADRA 
GVLAGHDNRVS CLGVTDDGMAVATGS WDS FLKI WN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTNKVHAI PLRS S W VMTCAYAPS GN YVACGGLDN I CS I YNLKTR 
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTrCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNI ICG I TSVSFS KSGRLLLAG YDDFNCNVWDALKADRA 
GVLAGHDNRVS CLGVTDDGMAVATGS WDS FLK I WN 


5640 


280 


1092 


QQGNKKTMLSHNTMMKQRKQQATAIMKEVHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLEALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTDPRFMS FVNPLSGRRS FNRTPKG W I SENI P I VI TTE PTDDTT 
VPESEDL 


5641 


27 


332 


CKHNCNGDVKLLSNQMDKLFAFHLFTFHGLLHFLDGSIQKLIQA
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGES FVLSMI VTG 


5642 


199 


1247 


ITPCRMDFLVLFLFYLASVIiMGLVLICVCSKTHSLKGLARGGAQ 
I FS C I IPECLQRAMHGLLHYLFHTRNHTFI VLHLVLQGMVYTEY 
TWEVFGYCQELELS LHYLLLP YLLLGVNLFFFTLTCGTNPGI I T 
KANELLFLHVYEFDEVMFPKNVRCSTCDLRKPARSKHCSVCNWC 
VHRFDHHCVWVNNCIGAWNIRYFLIYVLTLTASAATVAIVSTTF 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
LGFVWLS FLLGG YLLFVLYLAATNQTTNEWYRGDWAWCQRCPL 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQB 


5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRWMHRRGVGAGAIAKKKLAEAK 
YKERGTVLAEDQIiAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQFQDMCATIGVDPLASGKGFWSEMLGVGDFYYELGVQIIEVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDL I RAIKKLK 
ALGTGFG 1 1 PVGGTYLIQSVPAELNMDHTWLQI»AEKNGYVTVS 
E I KAS LKWETERARQ VLEHLLKEGLAWLDLQAPGEAHYWLPALF 
TDLYSQE ITAEEAREALP 


5644 


83 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAIEDKDMQQKBQQ FREW FLKE FPQ IRWKI QES I ERLRVIANE 
I E KVHRGCVI ANWSGSTG I LS VIGVMLAPFTAGLSLS I TAAGV 
GLGIASATAGIASS IVENTYTRSAELTASRLTATSTDQLEALRD 
ILHDITPNVLSFALDFDEATKMIANDVHTLRRSKATVGRPLIAW 
RYVPINWETLRTRGAPTRIVRKVARNLGKATSGVLWLDWNL 
VQDSLDLHKGEKSESAELLRQWAQELEENLNELTHIHQSLKAG 


5645 


537 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL 
YLCLQNSLLGHS S VEDARATMELYQ ISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRASI PATKRAS FLSS FI KMFFEELEYILGF 
LSLLKFHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
EEDTQRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLQEYQLPY 
QRVLPLPI FTPAKMGATKEEREDTPIQLQELLALETALGGQCVD 
RQEVAEITKQLPPVVPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VLS ELCGRHEALRE VGAEWPP PTCS PN I CSGLQQAGNTDWSIiTM 
APQSLPSSRMAPIiGMLLGLLMAACFTFCLSHQNLKEFALTNPEK 
SSTKETERKETKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS 
HVRLNIiQTGEREAKLQYEDKFRNNIiKGKRLDINTNTYTSQDLKS 
ALAKFKEGAEMESSKEDKARQAEVKRLFRP IEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLS FGGLQWINGLNSTEPLVKEYAAFVLGAAFSSNPKVQVEAI 
EGGALQKLLVILATEQPLTAKKKVIiFALCSLLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRWTLLYDLVTEKMFAEEEAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTIXSVLIiTTCRDRYRQDPQIKjRTLASLQAEYQVLASLELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


MLQEQLDAINEE IRM IQEEKESTELRAEE I ETRVTSGSMEALNL 
KQLRKRGSIPTSLTDLSliASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDbRKHRRKLLS PVSREENREDKAT I KCETS PPS S PR 
TLRLEKLGHPAIiSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR 
KGIKSS I GRLFGKKEKGRL I QLSRDGATGHVLLTDSEFSMQE PM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTVVSWL 
ELWVGMPAWYVAACRANVKSGAIMSAIjSDTEIQREIGISNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECIiV 
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE 
LEKRREESQHEIKDVIjVWTNDQWHWVQSIGLRDYAGNLHESGV 
HGALLALDENFDHNTLALI LQ IPTQNTQARQVMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5650 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEAliNL 
KQLRKRGS I PTS LTDLSLAS AS PPLSGRSTPKLTSRS AAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR 
KGIKSS IGRIiFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWGMPAVfYVAAOlftNVKSGAIMSALSDTEIQREIGISNALHR 
LKLRLAIQEWSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGI1PQYRSYFMECI1V 
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE 
LEKRREESQHEIKDVLVWTNDQWHWVQS IGIiRDYAGNLHESGV 
HGALLALDENFDHNTLALII^IPTQNTQARQVMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651 


646 


1869 


ARQGQRQPWG*EARAKGPASESPRV*EGSGWEGPASP*TPGSTL 
AWGEGAGI R* ASGLTAAGAAS AAAA/ PPPTRGG PAPAGCGRAPP 
WPAPLRVPTHGRAPAPRSRAAPRAPALSHGTAAAALSPASPAGP 
ADP*LPGHSSQS PPRG *RWGRSRSAPAPAHPEHPAPAGSASASQ 
QTPGWPGS CCIiAQGWQAE PLGAPGAEDG \ P VP PQRG FPLGTLGS 
PAGSWAGLAGYG*AGAPGTQATAPRAAGQTPVAAAPNCRV*GSA 
PALHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 
GWRAGISPELLGAAGLSDNWARCPGPGPAE*GGQPGCRTIPASA 
CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHQKS FSCPE PACGKS FNFKKHLKEHMKLHSDTRD YI 
CEFCARSFRTSSNbVlHRRIHTGEKPLQCEICGFTCRQKASLNW 
HQRKHAETOAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 


5653 


66 


1401 


RGRLQSRGRLTLGLVLLLLD I LGARQHGQRVSHGWKGGFLTAPL 
CFPQPCQPGTRRGRRRSLKEATEPQLAMAEEFVTLKDVGMDFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGSPEATSPDVTETKNSPLMEDFFEEGFSQEI/SRDVIQ 
G WLLELQFRRS LYRGHLVR * FARRSRKS S EV* YCHQRGKSHGMQ 
ES * IKERTQSCVHRFHGRRFHG\DNVSEKTLTPAKSKEYRGEFP 
SYSDHSQQDSVQEGEKPYQCSECGKSFSGSYRLTQHWITHTREK 
PTVHQECEQGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGBHQKTHTDSKSYNCN 
ECGKAFTRIFHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
NWKPFVYGGLAS ITAECGTFP IDLTKTRFQ IQGQTNDAKFKEI I 
YRGMLHALVR I GREEGLKAL YS G * VGLHAFLCHCS LFHMG I DFR 
PRLHRSQVKSLRCV * KEQ I A* * /MFSLLISTLISKYI YYAADVL 
EKL FYY IQVQTDNNKKI CLFKNI 


5655 


2 


867 


RP PGIRAPRQLHPAAGRRPDASARPRFRPTVLIjHDPFQLS FPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEM I PFKDEGDPQ\REKI FAE I VNPEEEGDLADI KSSLVNES 
EI IPASNGHEVARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PS YSS YS G Y I MM PNMNND P YMSNGS LS PP I PRTSNKVP WQ PSH 
AVHPLTPLITYSDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRRVP PIiPE FASG PGAAF FHSGRLQRS LTKDS AGC FSQ CRS RAM 
LVLRSGliTKALASRTLAPQVCSS FATGPRQYDGTFYEFRTYYLK 
PSNMNAFMENLKKNIHLRTS YSELVGFWSVE FGGRTNKVFH I WK 
YDNFPHRAEVRKAIiANCKEWQEQS 1 1 PNLARIDKQETEITYLI P 
WS KLQKP PKEGVYELAVFQMKPGGPALWGDAFE RAINAHVNLGY 
TKWGVFHTEYGELNRVHVLWWNESADSRAAVRHKSHEDPISWG 
GVRESVNYL\VSQQNM 


5657 


105 


1052 


GQRLQS PRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEES EEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKIiIjLYDTLQGELQERIQR 
LEEDRQS LDLS S E W WDDKLHARGS S RS WDS LP P S KRKKAPLVS G 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLKCTDTELQLRRDAI FCQALVAAVCTFSEQLLAAIjGY 
RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLEDI WVTLS E LDNVTFS FKQLDENYVANTNVFYHIEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQQD INAQSLE KVQQYYRKLRAFYLERSNLPTDAST 
TAVKIDQLIRPINALDELCRLMKSFVHPKPGAAGSVGAGLIPIS 
SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGLLPKC 
IMQATDIMRKQGPRVEIIiAKNLRVKDQMPQGAPRLYRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVS P KGELGAWRGNSGR PKI I GRAAEAENEDRTLGRLLP 
GNERSQPRSPLRLI*APQLKAEAAADKGLAPVPPPFSSGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG \AAVAGAAGGARRFLCX3WEGFYGRPWVMEQRKEL 
FRRLQ KWELiNTYL 


5660 


229 


853 


P VTMWAFS E LPMPLL INL IVSLLGFVATVTliI PAFRGHF I AARL 
CGQDLNKTS RQQ I PESQGVI SGAVFLI ILFCFI PFPFLNCFVKE 
QRKAFPHHE FVAL I GALLAI CCMI FLGFADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGOTTIWPKPFRPIIX3LHLDLGR*SYHCC 
PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL 
AGMAVTCD P KAFLS I CFVTLVFLQLPLAS I CQN* GTDS CASRG K 
AD FD VTG PHAPI LAMAGGHVELQ CQL F PN I S AEDME LRW YRCQP 
S LAVHMHERGMDMDGEQKWQ YRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF 
QNVELKAE K I KVIGNCDAKDFP I KYKERHPLE YLRQYPHFRCRT 
NVLGSILRIRSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGE 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHLAEFYMIBAEISFVDSLQDLMQVIEELFK 

AVE I LKQAS QN FTFTPEWGADLRTEHE KYLVKHCGN I PVFVINY 
PLTLKPFYMRDNEDGPQELEGSVA*HSLGLMILLSIWIGQP


5663 


119


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
T rrvo vr«T? xrr»nT TWDr&va T PT7CT .PRWnfT.nc! TXOOCiT.YRKVMIjENYR 
NLVFLGIALTKPDLITCLEQGKEPWNIKRHEMVAKPPVICSHFP 
QDLWAEQDIKDS FQEAI LKKYGKYGHANFQLQKGCKS VDECKVH 


5664


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG 
GP P PGWDAAP PERGCE I FIGKLPRDLFEDELI PLCEKI GKI YEM 
DMMMnTTKrPWT^r:Ya'F r VTT?<;NKVEAKNAI KOLNNYE IRNGRLLGV 
CASVDNCRLFVGGIPKTKK 


5665


347 


702


WQHL I ILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETEVKGKRKRGRPGRPPSTNKKPRBCS PGEKSR I EAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAALVFYSCIFII 
FYYAKDEWPFGEYFCQI LGA 


5667 


1 


695 


tttit -ncis pt r«T D c\?C T Pi/CT P\7PQZ1T.T.F2XWPM1jPKRRRARVGSP 
SGDAASSTPPSTOFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VXDACSSEATHWMEETSAEEAVSWQERRNLAA^ 
ISWLTESLGAGQPVPVECRHRIiEVAGPSKGPLSPAWMPAYACQR 
fvrr>T >**TTTTXTrrv*T ppat ptt TVT?A7\r;T7irriCTrr2PT.r.'T'Trr'Paz\.c;vTi'prRTi 

PSPVTTLSQLQ


5668 


691 


894


reSFLFCIPDLFLQFLU5RKEEEAVLVGGEWSPSLIX3LDPQADP^ 
[ VIjVRAAlK'Ji\yAv*'^l* J l»^^^* 


5669 


407 


1 


DSGAPEGLS P LMS TQEGLSMHAHPQAYTPFI YLHAR KRRGE I GD 

ARGAPTICYSGSPIGSPTTTPPTRPPSFl^HPAPHLIiASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPLLILCTVSVASYELAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWFQQKPGQAPVLVIYKDTERPSGIPERFSG 
1 oT>or , T'T\7 T rT TTCrtAnVFnRAnYFCY^ATDNFLWVF 


5671 


280 


524 


KFPPKKTP PEiIX3MES AITLWQFIjLQLLLDQKHEHLI CWTSNDGE 
FKLLKAKKVAKLWGLRKN^^


5672 


2


557 


S DPYC I VKVDNE P 1 1 RTATVWKTLC P FWGEE YQVHLP PT FHAVA 
FYVT^DEDALSraDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 

QAWLLLPLP


5673 


327 


696 


ITVADQISHWSAGRIKNRTRIPECIHSSAATTLAGPHTMEGESV 
KLSSQTLIQAGDDEKNQRTITWPAHMGKAFI<VmEl^^ 
DVMIVAEDVEIEAHRWLAACSPYFCAMFTGDMS


5674 


17


984


■1 Viriric'Mpnrc'PCBm Q^^/T/^ZVT.BLPnRT.WTDSilTEGFLI-GEVKGE 
AKNS ITDSQMDDVE WYT IDI QKYIPCYQLFSFYWS SGEVNEQA 
LKKILSOTKKIWVGWYKFRRHSDQIMTFRERLLHO 
DIi VFLLLTPSI ITESCSTHRLEHSLYKPQKGLFHRVPLVVANLG 
MS EQLGYKTVSGS CMSTGFSRAVQTHSSKFFEEDGS LKEVHKIN 
EMYAS LQEELKS I CKKVEDS EQAVDKLVKDVNRLKRE I EKRRGA 
QIQAAREK^IQKDPQENIFLCQALRTFFPNSEFLHSCVMSLKID 

MFLKVAVTTTTISM


5675 


1 80 


753 


EG SRRG PTRLARLS ARAGRLHF P PGFS SRL I HFRGVS E CRR PPG 
KSGVP VS APG SDG KWWE ERPGMF S LMAS CCGWF KRWRE P VRKVT 
LLMVGLDNAGKTATAKGIQGEYPEDVAPTVGFSKINLRQGKFEV 

TIFDLGGGIRIRGIWKNYYAESYGVIFWDSSDEERMEETKEAM
SEMLRHPRI SGKP ILViiANKQDKEGALGEADVIECLSLEKLVNE 
HKCL 


5676 


2 


930 


FVS S PPPRP VQPARPGGFGLSGRRSLLCQVASTPAHVGVMRS PV 
RDLARNDGEESTDRTPLLPGAPRAEAAPVCCSARYNIiAILAFFG 
FFIVYALRVNLSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTGKKYQ WDAETQGW ILGSF FY GYI I TQ I PGGYVAS KIGGKM 

LLGFGILGTAVLTLFTPIAADLGVGPLIVLRALEGLGEGVTFPA 
MHAMWSS WAP PLERS KLLS IS YAGAQLGTVT SLPLSGI I CYYMN 

WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHIjVAAARAA 
VTAETHPLPLLAPIAVCQSVKS PAACQVRPRPRAVALPAALGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLWLIHH 
RKHAGPIVSVWHRELRKAKSNRKLTFLYLANDVIQNSKRKGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 

IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEEDDDYPGSY
SPQDPSAGPLLTEELIKALQDLENAASGDATVRQKIASLPQEVQ
DVSLLEKITDKEAAERLSKTVDEACIIRNRGPGTS


5678 


3 


593 


SSSPPSSTPSLPIiPFYLLLGQLRIiQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SIAEFTEQFNQLHNRRl^NIiQLGPLGRDPPQECSTFSPTDSGEB 
PGQLSPGVQFQRRQNQRRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVSKRLS LPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE
DDAEVQQECLHKFSTRDYIMEPS I FNTLKRYFQAGGS PENVIQL 
LS ENYTAVAQTVNLLAEWL IQTGVEPVQVQETVENHLKSLLIKH 
FDPRKADS I FTEEGETPAWLEQMIAHTTWRDLFYKLAEAHPDCL 
MLNFTVKVGRVLELRRKVTMNVYFWLIiVCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQABGYNRSGINNHQAEDPRFCPSFCWMRSA 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL
FTSAGLWIVYFIAVEDDKIliPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFS QVMNMAAFLALVVAVLRFI QLKPKVLNPWLNI SGLVA 
LCLAS FGMTLLGNFQLTNDEE IHNVGTS LT FG FGTLTCW I QAAL 
TLKVNIKNEGRRVGIPRVILSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682 


39 


622 


PSRS CLGTMRKWRHREVNLPEVTQQDAVCPAP I PS PGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGLLLLVPLLLiLPGSYGL 
PFYNGFYYSNSANDQNLGNGHGKDLLNGVKLWETPEETLFTYQ 
GAS V I LP CRYRYE PALVS PRRVRVKWWKLSENGAPEKDVL VAIG 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCTVCS KKFAS FNAYENHLKSRRHVE LE KKAVQAVNR KVEMM 
NEKNLEKGLGVDS VDKDAMNAAIQQAI KAQPSMS PKKAPPAPAK 
EARNWAVGTGGRGTHDRD PSEKPPRLQWFEQQAXKIiAKHS EDD 
SEDEEHDLC 


5684 


195 


677 


TWCFRGYLGPRVI MKALDEP P YLTVGTDVSAKYRGAFCEAKI KT 
AKRLVKVKVTFRHDSSTVB VQDDH I KGP LKVGAI VEVKNLDGAY 
QEAVINKLTDASWYTWFDDGDEKTLRRSSLCIiKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


LLLQQPWHCFLLFPPFRFSHHMI PGPPGPHTTGI PHPAIVTPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQI LGRR WHALS RE EQAKYYELARKE 
RQLHMQLYPGWS ARDNYVS PS S I PVALHS 


5686 


128 


1181 


CTWWQVNITLLD INDNHPTWKDAPYYINLVEMTPPDSDVTTWA 
VDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 
QNLPFVAEVLEGIPAGVSIYQWAIDLDEGLNGLVSYRMPVGMP 
RMDFLINSSSGVWTTTELDRERIAEYQLRVVASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
NDVGIiNAE LS YF ITGGNVDGKFS VGYRDAWRTWGLDRETTAA 
YMLILEAIDNGPVGKRHTGTATVFVTVLDVNDKRPIILQSSYV 


5687 


17 


917 


AAPPAPPDG / PPP / PPPAPPT/ PGPAA/ APAbbCUPKJjbAljrKAA 
QGDGGAAAVGHVLWPAVGPVRVNPGLQTPVPRPELLPGP \ SSS 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRMPSTSASE/AAGGQGACTHAKGSETPPPASPQTSEPAPSP 
LP PHLTGGPGM YS SEAKXiPNS F S CLGJjAIj rvjAvar X w o lHoAnkj i 
PPVIjPHVCTPSLANPQP\AVGPEASSIjPLGVSGIGMSA/SAPIS 
S S P FVAIGS CWLRG I PP PGSGFLCPGRAPGP VP I TTHGQEGQGP 
VLDI 


5688


1 


420 


LTKWDLFGNCYRLLKTGIEHGAMPEQVGVYWYS / CLYDSRKLFF 
* SHMI IRSLL* KVIDDSLGQLPLLRELLL* * LNVIDRCI I LAYV 
IiRVEKTFAITYLKNFTVKVDFSLLGE I PLISMAAILKLWIMKID 
DGYIPAVF 


5689 


1504 


3 


HELSGKHISMVSGNTCNWHPGGHS PGGGGQGEITSKDRGE IPAL 
XWA/RKPIGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\S PLPACPPRAwPKAGAVASATGTG \PQLPGSRCjKU 
KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 
PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG* PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCMYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSAS SSHR* GG*ERARAGAGHRGST* A 
S S KIEQGRPRPGPTSDALADVEGGAES /GPHPWPLPGTLPNR/ P 
GS PP PA* AS AGRKGTVSTLGGGLL 


5690 


1424


58 


PSPPAGVCAAPAPLPLLALARRDRRPCS PGAEAAPWQTGGPAID 
GAWRTS VS ALRRGATG / APCS PGAEAAPWQTGGPAI DG\DGELP 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAP PGGPAP 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR* SQRT* ERARFKH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
S LS LLG P / PGAHNLDTAPQDR * HGP * GDKRGAPGVAGEDPRPP * 
GNFVR*LLI*MP/GVA*KHV3 JloPr IjLrFoJbvjJiiWijUywL'tovjJNLiroi r 
KG * SHPAFTKST* SMEAEKS YWNHPHR\DRGRQGVR INCLRVGE 
SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 

PEGDPGLNS PGLLP 


5691 


1 107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGILSFIED 

GDAPLI FSPYLSLTGNCGFAMLVE ITERAMAH\CGSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193


548 


TQAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPL\QSVPCIQK 

noT^nnvriT /pr TV^C*"^PPr2DVrarrW^P\7T?PPTrnP^CY?PA < ?'J?MPTi 

TS RSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVS Q 
RLNLPVMGATRSNLQPPRKVAVPGPTR*RDQDSKQDFSSKPLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1258


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
pnw /pnq ap^HAP /PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 



GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5695 


3 


1336 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATEAAPPPPEPVPAA 
QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP
EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVDFT 
QEEWQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPSRQTVFI
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
NASSEYISSDGRYARMKADECSGCGKSLLHIKLEKTHPGDQAYE 
FNQ 


5699 


2 


1448 


RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD
EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA
RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLLAS
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL*RS
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNT
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPILFQNPSGALRSRRTEPAGWVPPTRHE*DDG*TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRLAAGAHTSASARCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA
HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL
FPPGRG 


5700 


923 


597 


NGHKGVWEINIY*RRSNIHKNSKSESHLNQDHSFPPPTPNSARS
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410


IFEKICSDTQEFISPEINPQICSWLIFDKGAK/NHATGKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVITPSRASESSASSDGPHPVITPSWSPGSDVTLIAEALVTVTN 
IEVINCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP
DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 
PVSIEAGSAVGKTTSFAGSSASSYSPSEAALKNFTPSETLTMDI
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG*CSSSTGNSTPTRLTSRSPYCVSGEANG/PSAAARHVPYAKR 
GCCP*PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATGPS 
LTSTGVYVWGGASPVPRGVLGLTLAHVLCFSKEKT


5703 


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLSYRCPWQA 
PKAGTGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS
RILRWHP*HTAAR*PRWRRLPSSHRWTRHLGVLRVQDKS**VSL
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR
GSWETAPGS*WCPWL*AARWTGWRTASGASAGLGRAADRPSAWA
RRVAGLLPGQGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC
ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA
AGWPRHSPHDTQTPEP


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5705 


23 


562


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD
DYVANTDNCSLKDLVRECERRYCAFNNWGSVEEQRQQQAELLAV
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE
WQVEKHKQELRENESNWAYKALLRVKHLMLLHYEIFVFLLLCSI
LFFIIFLF


5707 


28


609 


GSPAPTPGPRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSNSNFMASAKDANE
NWHGMPGRVEPILRRSSSESPSDNQAFQAPGSPEEGVRSPPEGA
EIPGAEPEKMGGAGTVCSPLEDNGYASSSLSIDSRSSSPEPACG
TPRGPGPPDPLLPSVAQA


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA
QASVSRPHDRA*GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL
RVSPASGGPRKEGRQGSGG*AGGGGP\ARTHADLPCVGFVCSPP
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S
SRRRRGP*AAGRSTPAVP*PCS*GGAGRRAYACRTGWGYAPSR*
LEPSGPTSGSAL*TWASHSTGA**SRLCGTAGTGPLCSQSSRS*
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH
GRGRAMGSRCVCTCTGLPCPGIPLSGASPGGSGETGAGRSHTLK
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA
PPPRPEPPPPPARRP


5709 


2 


2031 


ITLCPLPQTEKCLNWTEAATPLGIYLKARVEAGGLKELEISWti
LHQIWRWGAWMRAGMGGCRCWGVMAPFAPR/NALSFLVNDCS 
LIHNNVCMAAVFVDRAGEWKLGGLDYMYSAQGNGGGPPRKGIPE
LEQYDPPELADSSGRWREKRSADMWRLGCLIWEVFNGPLPRAA 
ALRNPGKIPKTLVPHYCELVGANPKVRPNPARFLQNCRAPGGFM
SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPEDFCRHK 
VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI IPVWK 
MFSSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIFPHWHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI
RCNTTVCLGKIGSYLSASTRHRVLTSAFSRATRDPFAPSRVAGV 
LGFAATHNLYSMNDCAQKILPVLaSLTVDPEKSVRDQAFKAIRS. 
FLSKLESVSEDPTQLEEVEKDVHAASSPGMGGAAASWAGWAVTG
VSSLTSKLIRSHPTTAPTETOIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS*RYTAGQRV


5710 


1 


562 


IPGSTISCEVELMARMAKTIDSFTQNQTRLWI IDGLDACEQDK 
VLQMLDTVRYLFSKGPFIAIFASDPHI I IKAINQNLNSVPSGFK 
\LNGHDYMRN I VHLPVFLNSRGL/RQ/LQENFS * LQQQMETFHA 
QILQGYRKMLTEEFHRTALGR*QNLVARQPSIDG*DAIGFELYV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVLSQQRPSLFHECAFHFFS*SLQRHTINLDQGIF*LLM 
LSEERQHLFESS/ 1 WTTPHNLK* / FE IHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQS LD I S ERLKFLLTLDC VDDTL I VLAEEHGCLD 1 1 KE L P 
ETVIDLIiNKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPFTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 
AGGDLE KE LVNKE 1 1 RS KP P I CTL PN FLFEDGE S FGOGRDRS S / 
TFR*YHWDIWMPAKK*IERCWGRSILPITLKMTSLILPYSNSN 
NELSAAATLPL 1 1 REKDTE YQLNR 1 1 LFDRLLKAYP YKKNQ I WK 
EARVDI PPLMRGLTWAALLGVEGAIHAKYDAIDKDTPI PTDROI 
EVDI PRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
S LCAPFLYLNFNNEALVYACMSAFI PKYLYNFFLKDNS HVI QEY 
LT VFS QM I AFHD P ELSNHLNE IGF I PDL YAI PW FLTM FTHVF PL 
HKIFHLW\DTLLLGEFLFPILYWE 


5713 


634


284 


P VCAVPVDRWPVLPREDQEGQQL* AKLPRDFRR * FQ I LGPMEGH 
TACRCSRRGAQVQHLPREDIRAAE*DPHLREVWPGLPTSSATSP 
* RAVLTS PCSHLGSADAAS SHWLCGVS FH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KSAWAAAAPASVADDTPPPERRNKSGI I SEPLNKSLRRSRPLS 
HYSS FGS SGGSGGGS MMGGES ADKATAAAAAAS LLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE
QTPPASKLQGGGGGLQTGWGLHPVPVTAASPLPRWCLFGAVAK\
GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCLHPPSMVENN
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVDLLPSAGR
TQAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC*EV\GALGEPVRIPG
L*PDLSCILSNGSKHRREGLSFPRSLGPGRRGPAGLQSLGCSPT
PKNTACHSSGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP
VWRWIPLE*LGLSRETGQATRRGLVWISPGRAAAACVACAQALE
EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT
GLT/GVPGTDPKRGGRKPGQSGQETQGPTVWSGPESPLQPKP*E
RQE/VGAGASSGVGLSRGRAGGPSSAWEVAAMLLLLRHGSHSEL
TDLTEAQTSQH


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL*SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE
GDSLGARPGLPYGLSDDESGGGRALSAESEVEEPARGPGEARGE 

RPG PACQLCGGPTGEGPCCGAGGPGGG PLJj P PRIjLi x 5 CKJji- x r v 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAQLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCP1 Ujr Kcu 1 f 
RP ARPPS PTEQEGAVPRRPEDALLLPDLSLHVP PGGAS FLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQCjPSDKJbl* At-bliuFr Ai rii fiNnuA 

RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGEKPYKCPL
CPYACGNLANLKRHGRIHSGDKPFRCSLCNYSCNQSMNLIRHM 


5718 


120


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 


5719 


48 


428 


ELNNGPFQMPLCNGGNLAVTGSWADRSPLHEAASQGRLLALRTL
LSQGYNVKAVTLDHVTPLHE^CLGDHVACARTLLEAGANVNAIT 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1


1051 


LQAFRNAS E VPMVLVGTQDAI S AA \NPRVYRRTSRARKLS TDLK 
\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKVVAL\RKKQQ\IA1 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDS IGSGRAI PI KUl* 1 ijljiU<bijAi»iiiNJ\ 
EWKKKYVTLCDNGLLTYHPSLHDYMQNIHGKE I DLLRTTVKVPG 
KRLPRATPATAPGTSPRANGLSVERSNTQLGGGTGAPHSASSAS 
LHS ER PLS S S AWAGPRPEGLHQRS CS VSSADQWS EATT S L P PGM 
QHPASG 


5721 


97 


492 


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA 

VFYAI AGGLFLERAYYYAFAAHHTGI TDTTRVGI ILSRGTAASI 
SFMFSYILLTMCRNLITFLRETFLNRYVPFDAAVDFHRLIASTA 


5722 


88 


1043


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI


5723 


88 


1043


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI


5724 


3 


1841


FTNEAPPAPLPDASASPLSPHRRAKSLDRRSTEPSVTPDLLNFK
KGWLTKQYEDGQWKKHWFALADQSLRYYRDSVAEEAADLDGEID 
LSACYDVTEYPVQRNYGFQIHTKEGEFTLSAMTSGIRRNWIQTI 
MKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTEKQEAELGEP 
DPEQKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGP 
ADTH\DPWRPEAEHGELERERARRREERRKRFGMLDATDGPGTE 
DAALRMEVDRSPGLPMSDLKTHNVHVE I EQRWHQ VETTPliKiSfc. K. 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 
NRLLQDQLRVALGREQSAREGYVLQATCERGFAAMEETHQKKI E 
DLQRQHQRELEKLREEKDRLLAEETAATISAIEAMKNAHREEME 
P F LEKSORSOIS SVNSDVEALRRQYLEELQS VQRELEVLSEQYS 
QKCLENAHLAQALEAERQALRQCQRENQELNAHNQELNNRLAAE 
ITRLRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
TQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 


5725 


3 


1049


" VNGHSEETSQSPNRTEPHDSDCSVDLGISKSTEDLSPQKSGPVG 
SWKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVDG
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN
PE/GAKYNKRPHKWAHNLHLKYMVLHSIISNTVAV\RSQRHFVA
LQTKSPNRPCQFSSSAPS/VDQRAQ/INQSYAKHSANMNFSNHN
NVRANTAYHLHQRLGPARHGEMWAISPNDRLIPAVTRSTIQRQS
SVSSTASVNLGDPGSTRRAQIPEGDYt»SYREFHSAGRTPPMMPG 
SQRPLSARTYSIDGPNASRPQSARPSINEIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPLGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGS PGGSGEG PPLSS PSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLGQARTPPYLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPI LI LKETRRLPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NAIRAGWPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPABEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDLPAGQMSLAPPFPPVAAVIRSNK 


5729 


1 


1525 


AGGAREVLTLQLGHFAGFVGAHWWNQQDAALGRATDSKEPPGEL 
CPDVLYRTGRTLHGQETYTPRLIItMDLKGSLSSLKEEGGLYRDK 
QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGSS PLPTATTPKPLI PTEAS I RVWSDFLRVHLHPRSI 
CMIQKYNHDGEAGRIoEAFGQGESVLKEPKYOEELEDRLHFYVEE 
CDYLQGFQILCDLHDGFSGVGAKAAELIiQDEYSGRGIITWGLIiP 
GPYHRGEAQRNIYRLiLNTAFGLVHLTAHSSLVCPLSLGGSLGLR 
PE PPVS FPYLHYDATLPFHCSAI LATALDTVTCS \ YRLCSS PVS 
MVHL\ADMLS FCGKKWTAGAI I PFPLAPGQSLPDSLMQFGGAT 
PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 
LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 
SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAP ARET CVE CQ KTVY PME RLLANQQVFH I S CFRGS YCNNK 
LSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENQGRPLKSFGGEDCPSC*GGCPGSNY*AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGEL I PKDSCYMRKPPRRPKKRRQG / CALPQGCLTFKDVAI 
EFSLEEWKCLNPAQRALYRAVMIjENYRNLESVGLTSKDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTLVCGFTSFSFSLPLYLCGCLRF 
PERTCSQLQQADWAPDFGPSSFVPSWGATATGARKFLIAFNI\N 
LLGTKEQAHR IALNLREQGRGKDQPGRLKKVQGIGWYLDEKNLA 
QVSTNLLDFEVTALHTVYEETCREAQELSLPWGSQLVGLVPLK 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQYENDARTLFEFTSGVNDTESPIIYRDES
MRTACSPDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYM
aqvqysmvot^knawyfanydprmkreglhyvvierdek™\as 
FDEI\VP\EFIGKMDEVLSRDPM


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLIPAYSKNRAYAIFFIVFTVI 
GSLFLMNLLTAI I YSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 
SMVGEGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 
GSVLLSAEEFQKLFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFD YLGNLIALANLVS ICVFLVLDADVLPAERDDFI LGILNC 
VFI VY YliLEMLLKVFALGLRGYLS YPSNVFDGLLTWLLVLE I S 
TL\ VCTDCHTQAGGRRWW/RLLSLWDMTRMLNMIil VFRFLRI I P 
SMKPMAWASTVLGL 


5735 


2 


540 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL 
ATPRQLYK/ SSNMTQRWQRRE ISNFEYLMFLNTIAGRTYNDLNQ 
YPVFPWVLTNYESEELDLTLPGNFRDLSKPIGALNPKRAVFYAE 
RYETWEDDQSPPYHYNTHYSTATSTLSWLVRIVSIFIEIiACLWY 
LKILT 


5736 


1 


382 


GTRPSTKKSGYSPQQVAVIHCKGHQKENTAVAHSNQKADSAAQV
TAKljSVTPPJNLljPTVSrPQrUijFUNPV xbi 1 IhiKljAoULiKAWKW 
QES * * ILPDSGIFIP*T*TSYLQSTTHLRRAKLPQLLRR 


5737 


290 


1041 


KACLHLLSSPLTSNFLFNPLLPDStiySVEARSQRANLGPCRRKR 
LQTLMRLAAGFQYSSHKDPSLSAKEKHTDYHNEARGPWPGWVG* 
RTADGSCGRGPDGAHHPGPKS SSWRASRIiljPGl^GonnJjlJAi VL» 
RDLECGTPAPLQLE I P PQPRGHPAP IPTGQAGPRDSGPGAS P * V 
ETR PLTDGRR * PGVRP VGWT PAH PAGTLRPRGAVE PS VS ACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


8 


460 


DTLS LNCT1.PETLPMTPS F *LS FL* FPGLARAKS I PTKTYSNEV 
VTLW YRP PD I LLGSTD YSTQ I DMW * GQVEVWQGP CGKGGGL VTT 
ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FR I LSEE AWALCAVBTHR 


5739 


1 


1222 


S FQRRGIRWWVHTLHPHPRAVWAGIGRGHGS * ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDLLAEV 
SAE VDGPVPG YLSS PQSI TDTCLY I FTSGTTGLPKAARISHIjKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRIAVGSGLRPDTWERFVRRFGPIjQVIjETiGIjI£^ ±IN i 
TGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
PGEPGLLVAPVSQQSPFLGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGD PFRWKGENVATTE VAE VFEALD F 
LQEVNVYGVTV 


5740 


265 


231 


payw lkv p tlclksktdlre kashvs aqlqge vrgiiagal wm* a 
YWERVYN*NISRMVHALEQKRHPAGLSSSMALQIjNPCIjGMLMA 
LQS ELHKLYDEETQSWVSGSACGG YP 


5741 


1 1 


650 


PRKTMRRGVLMTLLQQSAMTLPIiW IGKPGDRPP P JjCGAI PASGU 
YVARPGDKVAARVKAVDGDEQW I LAE WS YSHATNKYEVDD IDE 
EGKERHTLS RRRV I PLPQW KANPETDP EALFQKE QLVLAL YPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPUSTVAQRYW 
ACKEPKKK*CRLADSPSPNDTGQDSRGRAGI KH1 ffLiAlUx 


5742 


2 


362 


TQSVKE I LKRNPNVNLTDKDGNTALM IAS KEGHTE I VQDLLDAG 
TYVN I PDRSGDTVL I GAVRGGHVE I VRALLQ KYAD I DIRGQDNK 
TALYWAVEKGNATMVRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 
KATGREISPREKTPEVIDATEEIDKDLEETGRREISPEENGPEE
VKPVDEMETDLKTTGREGS SREKTREVI DAAisvl BT. DLiEET. EKb 
ISPQE 


5744 


3 


703


TRRTTTTSPTTTRQMTTTPAALPTTWTTPDLTTGTPIiQMTTIA 

VFTTANTCIjS LTP S TLPEEATGIiLTPE PS KEGP I LTAES ETVLP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGI PMSMKNEMP ISQLLMI IAP 
SI^FVLFALFVAFLLRGKLMETYCSQKHTRLDYIGDSKNWiNDV 

QHGREDEDGLFTL 


5745 


1400 


599


GKSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGLEPKTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 
YICGSHGVEHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPEEGR 
E KSEEERS KHKRKKS CEE IDLDKHKS I QRKKTEVE I ETVHVS TE 
KLKNRKEKKSRDWSKKEERKRTKKKKEQGQERTEEEMLWDQSI 

LGF 


5746 


3 


821


S FASGRLTP S S P AFDGE LDLQR Y SNG PAVS AWSLGMGAVSW S E S 
RAGERRF P C P VCGKRFRFNS I LALHLRTHQ P ERPRS P AARLLLE 
t PK , T?UT.T.PR23iT?T.f^T?AP <: i c ?GnMOATPATEGIiARPOAPSSSAFRCP 
YCKGKFRTSAERERHLHILHRPWKCGLCSFGSSQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EATPTPAPAAPEEPPAPPEFRCQVCGQSFTQSWFLKGHMRKHKA 

SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD
SDNGDIfTJfDiVHELSLEMKRQKIQREIiMKLEQENMEKREEIIIK 
KEVSPEVVRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 
AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
KYKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSP 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
AS PYP3HSLSS PQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5748 


934 




S EG PQ VFYKGLAPTL I AI FPYAGLQFSCYSSLKHLYKWAI PAEG 
KKNENLQNLLCGSGAGVI SKTLTYPLDLFKKRLQVGGPEHARAA 
FGQVRR YKGLMDCAKQVLQKEGALGFFKGLS PSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSIiAS 
SASSTYSSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVIQGALNASETTPKELRI KRQNSSDS I SSLNS ITSHSS I 
GSSKDADA 


5750 


22 


866 


IFISICLWNAHLCFLLLPKDCIDQVMKLQNLFVDDSGRYIiAIQF 
HLEWAYVFLYYYE YRKAKDQLDIAKD I SQLQ IDLTGALGKRTRF 
QENYVAQLILDVRREGDVLSNCEFTPAPTPQBHIjTKNLELNDDT 
ILNDI KLADCEQFQMPDLCAEEIAI I LGICTNFQKNNPVHTLTE 
VELLAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 
TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQI FEKLEMWE 


5751 


3 


751 


S CG S ALRAWRCGAAALATFPAPALPGLM YRAL YAFRS AE PNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRIjQGLEQ 
DVLQAIDRAI EAVHNTAMRDGGKYSLEQRGVLQKLI HHRKETLS 
RRGPS AS S VAVMTS S TSDHHLDAAAARQPNGVCRAG FERQHSLP 
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3


471 


GP VCGVGLS VAWAG P WRG P VHS VGGGGRAALHGAEL PCLSGAAT 
VEREMELRHKNEMLRVETEARARAKAERENADI IREQIRLKASE 
HRQTVLES IRTAGTLFGEGFRAFVTDRDKVTATVNI FI KQGWQV 
AERQHVGAS WS PRSCPCRLCTAL 


5753 


34 


483


DDSXAIPGGVQAP FGAVRN I YTPRTGHRI RKLDQI QSGGNYVAG 
GQEAFKKLNYLDIGEI KKRPMEVVNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14


331 


TLVHVVEFAGEHAEAI ASREQEVIiQGWKELLSACEDARLHVS ST 
ADALRFHSQVRDLLSWMDGIASQIGAADKPRCPSSLLGLPASPW 
WPTPATPS PLTAPFSME 


5755


3 


888 


LGDQF YKEAIEHCRS YNSRLCAERSVRLP FLDSQTG VAQNNCY I 
WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDP KLRLLE I K 
PEVELPLKKDGFTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDIPKRKNRTRGRARGSAGGRRRHD 
AASQEDHDKPYVCDICGKRYKNRPGLSYHYAHTHLASEEGDEAQ 
DQETRSPPNHRNENHRPQKGPDGTVIPNNYCDFCLGGSNMNKKS 
GRPEELVSCADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


SSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVAR 
WNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRHGLYSR 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGRIVNVTKEIL 


5757


3


473 


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ 
LS I SQSVHVAVKVP PLI QPFEFPPAS IGQLL YI PC WSSGDMP I 
RITWRKDGQVIISGSGVTIESKEFMSSLQISSVSLKHNGNYTCI 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSG I RCVA 
YNNQSNRLAVSRTDGTVEIYNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGEIMEYDLQALNIKYAMDAFGGPIWSMAASP
SGSQLLVGCEDGSVKLFQITPDKIPV 


5759 


2 


1240 


GNAAFAGQG WYET FHMS DL P S Y TTNGT VHWVNNQ IG FTTDPR 
MARS S P YPTDVARVVNAPI FHVNADDPEAV I YVCS VAAE WRNTF 
NKDVGADLVCYRJRRGHNEMDEPMFTQPL 

DKLIAEGTVTLQEFEEEIAKYDRI CEEAYGRSKDKKILH I KHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
KI HTGLSR I LRGRADMTKNRTVDWALAE YMAFG SLLKEG I HVRL 
NGQDVERGTFSHRHHVLHDQEVDRRTCVPMNHIiWPDQAPYTVCN 
S S LS E YGVLGFELGYAMAS PNAL VLWEAQFGDFHNTAQC 1 1 DQF 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDFEVSQL 


5760 


1 


1221 


VRDITSDSLSLSWTVPEGQFDHFLVQFKNGDGQPKAVRVPGHED 
GWISGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PASTEPPTPEPPIKPRLEELTVTDATPDSLSLSWTVPEGQFDHF 
LVQYKNGDGQPKATRVPGHEDRVTISGLEPDNKYKMNLYGFHGG 
QRVGPVSAI GVTAAEEETPTPTE PSMEAPE PPEE PLLGELTVTG 
SS PDS LSLSWTVPQGRFDS FTVQ YKDRDGRPQWR VGGEES E VT 
VGGLEPGRKYKMHLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TEPGTEAPEPPEEPLLGELTVTGSSPDSLSLSWTVPQGRFDSFT 
VQYI05RDGRPQATOVGGQESKVTVRGLEPGRKYKMHLYGLHEGR 
RLGPVSAIGVT 


5761 


3 


1275 


SCDMAEAAALVWIRGPGFGCKAWCASGRCTVRDFIHRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ I EKTTNREACRDLSGRRLRDVNHEKAMAE WVKQQAERE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHBMAERLEDSVLK 
GMQAASSKMVSAEISENRKRQWPTKSQTDRGASAGKRRCFWLGM 
EGLE TAEG SNS ES SDDDS EEAP S TS GMG FHAPK I GS NGVEMAAK. 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILEDSCAELGESK 
EHMESRMVTETEETQEKKAESKEPIEEEPTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGQTPLHSQGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MS S E EAANGKKSHWAELE I SGKVRSLSAS LWSLTHLTALHLSDN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKDTGL I MLIARLD YEL IQRFTLTI I ARDGGGEETTGRVR IN V 
LDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 
ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMRQLLRP IDRQRYDENEDLSDVEE IVSVRG FSLBEK 
LRSQLYQGDFVHAMEGKDFNYE YVQREALRVPLI FREKDGLGI K 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKij 


5765 


3 


825 


QKI LRLNNSHQPPTS SSNSKDCGGPASSGAGATAALADGLKFAS 
VQASAPQGNSHKETS KSKVKRSKTSKDANKSLPS AALYGI PE I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGS CGKSKEE KPGKSQS S RGAKRD KDAGKSRKD KHDLLQ 
GHQNGSGS QAPSGGHLYGFGAKSNGGGAS PFHCGGTGSGS VAAA 
GEVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


S G L FS VD P AS S QAMELSDVTL I EGVGNEVM V VAG W VJL X JUAXjVLt 
AWLSTYVADSGSNQLLGAIVSAGDTSVLHLGHVDHLVAGQGNPE 
PTELP HPS EGNDEKAE EAGEGRGDS TGEAGAGGGVE PS LEHLLD 

TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFWLLGWWYFRINYRQFFTAPATVSLVGVTVFFS FLV 

FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDWRALMKRKRMKANIKLVG 
SGFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
LTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP
HIDEFFTLNSTPSRSAYDEPHLLVNIEKQKLELBKRRLDIEABR 
LOVEKERLO I EKERLRHLDMEHERLOLE KERLOTERFK T ,PT n T V 

NSEKPSLBNELGQGE^MrjQPQDIETEKLKLERERIiQLEKDRLQ 
FLKFESEKLQ I EKERLQVEKDRLR I QKEGHLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRIjGSAVTSQRAGPA 

AAiWAKDYPFYLTVICRANCSLELPPASGPAKDAEEPSNKRVKPL 
S RVTS LANL I P P VKAT P LKR PS OTT .OR ^ T Q PP <5 p QR pn T T . a T3P D 

WSRNAAPSS TKRRDS KLWSETFDVC 


5769 


38 


667 


TKTKKGVKEKATDQSVKAFAEHCPEIiQYVGFMGCSVTSKGVIHL 
TKLRNLSSLDLRHITEL^^ 

GWCKEI TDQGATLIAQSS KSLRYLGl^CTKVNEVTVEQLVQQY 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVKTRKWSFLLEEHSKLIAKVRCXPQVQLDPLPTTLTIA 
FAS QLKKTS LS LTPDVPEADLS BVDPKLVSNLM PFQRAGVNFAI 
AKGGRLLLADDMGLGKT I OAI C I AAF YRKEWP"LLVWP<? 9 VP FT 

WEQAFLRWLPSLS PDC IiWVVTGKDRIiTA 


5771 


168 


741 


GLL PSACLRARS WRE AS EG PSS RA C <3 NCiCjO FIT PR ACV ci r3T <5 TP c; 

FHGSHCSGSDHSSU3LEQLQDYMWLRSKLGPLEIQQFAMLLRE 
YRLGLPIQDYCTGLLKLYGDRRKFLLLGMRPFIPDQDIGYFEGF 
LEGVGIREGGILTDSFGRIKRSMSSTSASAVRSYDGAAQRPEAQ 
AFHRLLAD I THD I E 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
II/T FSNTTjVT CS A T YHT »P VP P RR P, Pfi C Q MP nT.P XTh. 

JL J.I -1 JL tJilUV J. \<UA1 J- CTJLi JT V C C CiIUj jz VJ v»Ot XCvXvXJX^ VA 


5773 


2 


723 


PRVRSIOINFCFMEMNTRLQVEHPVTEMITGTDLVEWQLRIAAGE 
KIPLSQEEITLQGHAFEARIYAEDPSNNFMPVAGPLVHLSTPRA 
DPSTRIETGTOQGDFA/SVHYDPMIAIUjVWAADRQAALTKLRYS 
LRQYNI VGLHTNI DFLLNLS GHPE FEAGNVHTDF I PQHHKQLLL 
S RKAAAKESLCOAAT jT5Tj TTiKFJCAMTnTFTT iTH\wnn PQppe e e en 

RRLNISYTRNMTLKDGKNSK 


5774 


2 



592 


FVEEENIRWRCGGSELNFRRAVFSADSKYIFCVSGDFVKVYST 
VTEE CVHI LHGHRNLVTG I QI^PNNHLQLYSCSLDGTI KLWD YI 
DGILIKTFIVGCKLHALFTI^QAEDSVFVIVNKEKPDIFQLVSV 
KLPKSSSOEVEAKEL^ FVLDYTNOS PKC?T APGNEGVYVaaVPPP 

YLS VYFFKKETTS RVTLSS S 


5775 


3 


538 


SSGCCDPAAPSSIjAFAATMPVSKCPKKSESLWKGWDRKAQRNGL 
RSQWAWGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK 
RIXjYGTLSLPDOOTGKCRRVYSGWWKnnKK^RYnTnFRnPTfPVV 

EGDWCGSQRSGWGRMYYSNGDIYEGQWF^KPNGEGMLRLSQNP 
RP 


5776 


2 


484 


RLPQDCVCQNLSESI^TLCPSKGUjFVPPDIDRRTVELPiGGNF 
I IHISRQDFANMTGLVDLTLSRNTISHIQPFSFIxDLESLRSIiHL i 
DSNRLPS LGEDTLRGLVNLQHL I VNNNQLGGIADEAFEDFLLTL 
EDLDLSYNNLHGPAVGLRGDAWVQPSTS 


5777


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGP^PRJjGKIiRFQNDHLSVLKQ 
vkkleo alkdgs agld pol pgtc ys phc p pdkaeags tlpent^g 

GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSIjENV 

YRGSEGSPTKPFINPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 

NLPPLPSLPPPPLPSSPPPSSV^PJLgWTGRQKSSADHRKSra 

DLLQSSSESSRVDWYAQTKLGLTRTLSEENVYEDILDPPMKEKP 

YEDIELHGRCIXjKKCVIiNFPAS PTSS I PDTLTKQSLS KPAFFRQ 

NSERRNV 


5778 


1 


1210 



QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 

GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP 

WAPMAPERPEHLLNRVIjERI^GGATRDSAASDILI^ 

LFLPTEKFLQEMQYFVRAGGMEGPEGLGRKQAGjAMLIjHFIjDT 

YQGLLQEEEGAGH 1 1 KDLYLL I MKDESLYQGLREDTLRLHQLVE 

TVELKIPEF^QPPSKQVKPLFFJIFRRIDSCLQTRVAFRGSDEIF 

CRVYMPDHSYVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFACTRDSYEALV 








PLPEEIQVSPGDTEIHRVEPEDVANHLTAFHWELFRCVHELEFV 
DYVFHGE 


5779 


138 


1671 


E AVQ VL I KH S AD VNARDKN WQT PLHVAAAN KAVKCAEVI IPLLS 
S VNVS DRGGRTALHHAALNGHVEMVNLLLAKGAN INAFDKKDRR 
ALHWAAYMGHLDWALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHLLNLGVE IDE I NVYGNTALH I ACYNGQDAWNEL I DYGA 
NWQPNNNGFTPLHFAAASTHGALCLELLVNNGADVNIQSKDGK 
S PLHMTAVHGRFTRSQTLIQNGGE I D CVD KDGNT P LHVAARYGH 
ELLINTLITSGADTAKCGIHSMFPIiHLAALNAHSDCCRKLLSSG 
QKYSIVSLFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVECI 
KLLQS S GADFHKKDKCGRTPLHYAAANCHFHCI ETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLEFLLQNDANPS IRDKEGYNS IHYAAAYGHRQCLELLLE 
RTNSGFEESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS 
EIETSVVKGSHFPVGVVPPRAKSPTPESSTIASYVTLRKTKKMM 
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHQQACLREK 
KKGLNVIGASDQSPLQS PSNLRDNP 


5781 


19 


941 


RGSLGGHPWR^PMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF 
PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPL<GQ 
VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 
VAPSRTCGGC* TWDPALLVS P/PQGDSTPELPAP\QQPTGGPSR 
CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 



1237 


DRSMMSMAADSYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 

S QPE PP VSQSE I S E PS AVPTD YS VSASDPSVLVS EAAVTVPEP P 

PEPESS ITLTPVESAWAEEHEWPERPVTCMVSETPAMSAEPT 

VLASEPPVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVLA 

ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVIj 

EPS WTVPEPPWAEPDYVTI PVPWSALEPSVPVLEPAVSVLQ 

PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 

ILESSIMSSHVMKGINLSSGDQNDAPEIGMQEIALHSGEEPHAE 

EHLKGDFYESEHGINIDLNINNHIiIAKEMEHNTVCAAGTSPVGE 

IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 

ATG\TS KGI E FTTASTLSL WKYDVDLSLTTQDTEHDMLI STS P 

SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTOEPIjPVKRD\DQ 

TLAALI\SLKESSGGEKEVPPPS*REHLPDSGFSANIEDINEAD 

LVRPVSSPRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPW 

\SSMP\ERASGS\SSGEKGG\YEIF^KraDTHEKSKKNKNRDKG 

EKEKKRDSSLRSRSKRS KSSEHKSRKLTS ESRSRARKRS SKS KS 

HRS \ QTRSRS RS /RDRRRRS S RS RS KS RGRRS VS KEKRKRSP KH j 

RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 

SVGRRRS FS IS PSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 

RTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 

RS P I RRKRSRS S ERGRS PKRLTDLD KAQLLE I AKANAAAMCAKA 

GVPLP PNLKPAPPPTI EEKVAKKSGGAT I EELTEKCKQIAQSKE 

DDD VI VNKPHVS DEEE EE P PFYHH P FKLSEPKPIF FNLNI AAAK 

PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 

KDDDNVFSSNLPSEPVDISTAMSERAIiAQKRLSENAFDLEAMSM 

LNRAQERIDAWAQLNS I PGQFTGSTGVQVLTQEQLANTGAQAWI 

KKDQ FLRAAP VTGGMGAVLMRKMGWREGEGLGKNKEGNKE P I L V 

DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 

KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 

Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRSPRGLSHSPWAVKKINPICNDHYRSVYQ 
KRLMDEAKI LKS LHHPN I VG YRAFTEANDGS LCLAME YGGE KSL 
NDLIEE/PI*SQ/PKIIiFQQP/LILKVALNMARGLKYLHQEKKL 



LHGDIKS SNWI KGDFET I KI CD VGVS LPLDENMTVTDPEACY I 
GTEPWKP KEAVEENGV I TDKAD I FAFGLTLWEMMTLS IPHINLS 
NDDDDEDKTFDESDFDDEAYYAALGTRPP INMEELDESYQKVIE 
LPSVCTNEDPKDRPSAAHIVEALETDV 



5784 



2669 



1388 



PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV
GLSDAFVWHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW
VDSGCPEESKEKMCENTEPVET\FDEPPQP*ERQPPSSGS*LPP
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC



5785 



2669 



1388 



PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV
GLSDAFVWHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW
VDSGCPEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC



5786



2532



1674 



S YKL PAAE RRAS S CS Q P P T PTRRRW PAPGRTS RGHRPQM * SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 
S *H* KRNLSQRSSSMSRRPLSCARPHR* * RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 
SLAPS SRP / PKGRPQCTW I PSRWPAS PTAP PTTT* APTS S PGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP * WRPSGRLSTV* RA 
TGGSTATAPPKRFPRNWNPMMAE 



5787 



1460 



MASAASVTSLADEVNCP\ICQGTLKEAGSLSNCG/HKNFCRACL 
T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWQLANV 
VENIERLQLVSTLGLGBEDVCQEHGEKIYFFCEDDEMQLCWCR 
EAGEHATHTMR FLEDAA\ AP YREQ IHKCLKCL I KEREE I QE I QS 
RENKRMQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQLES 
QDGDI LRQRDE FDLLVAGE I CRFSAL I EELEEKNERPARBLLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF 
LEKLCFELDYBPAHISLDPQTSHPKLLLSEDHQRAQFSYKWQNS 
PDNPQRFDRATCVLAHTG I TGGRHTWWS IDLAHGGSCTVGWS 
EDVQRKGELRLRPEEGVWAVRLAWGFVSALGS FP \TRLTLKEQP 
RQVRVS LD YE VG W VTFTNAVTRE P I YT FTAS FTRKV I PF FG LWG 
RGSSFSLSS 



5788 



6860 



EHSVSGRSSAYGDATAEGHPAGPGS VS SSTGAI STTTGHQEGDG 
SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 
AI PYMQVI LMLTTDLDGEDEKDKGALDNLLSQLIAELGMDKKDV 
S KKNERS ALNEVHLVVMRLLS VFMSRTKSGSKSS ICESSSLI SS 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSSPPDMSPFFLRQYVKGHAADVFEAYTQLLTEMVLRLPYQI 
KKITDTNSRI PPPVFDHSWFYFLSEYLMIQQTPFVRRQVRKLLL 
F I CGS KEKYRQLRDLHTLDS \HVRG I KKLLEEQG I FLRASWTA 
S PQSALQ YDTL I S LMEHLKACAE IAAQRTINWQKFC I KDDS VLY 
FLLQVS FLVDEGVSPVLLQLLSCALCGSKVLRALAASSGSSSAS 
SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 
LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 
I YRNSS KSQQELLLDLMWS I WPELPAYGRKAAQFVDLLGYFSLK 
TPQTE KKLKE YSQKAVE I LRTQNH I LTNH PNSN I YNTL SGL VE F 
DGYYLESDPCLVCNNPEVPFCYIKLSSIKVDTRYTTTQQWKLI 
GSHTISKVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 
RWHPCAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 
TETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF 

LCNACGFCKYARFDFML YAKPCCAVDP I ENEEDRKKAVSN INTL 

LDKADRVYHQLMGHRPQLENLLCKVNEAAPEKPQDDSGTAGGIS 

STSASVNRYI LQLAQEYOGDCKNSFDELSKI IQKVFASRKELLE 

YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 

CASAVTEHCITLLRALATNPALRHIIjVSQGLIRELFDYNLRRGA 

AAMREEVRQLMCLLTRDNPEATQQMNDLI IGKVSTALKGHWANP 

DLASSLQYEMLLLTDS ISKEDS CWELRLRCALSLFLMAVNI KTP 

VVVENITIiMCLRILQKLIKPPAPTSKKNKDVPVEALTTVKPYCN 

E IHAQAQLWLKRDPKASYDAWKKCLP IRG I DGNGKAPS KSELRH 

LYLTEKYVWRWKQFLSRRGKRTS PLDLKLGHNNWLRQVLFTPAT 

QAARQAACTI VEALAT I PSRKQQVLDLLTS YLDELS I AGECAAE 

YLALYQKL I TS AHWKVYLAARGVLP Y VGNL I TKE IARLLALEEA 

TLSTDLQQG YALKSLTGLLSS FVEVES I KRHFKSRLVGTVLNGY 

LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFMAVCI 

ETAKRYNLDDYRTPVF I FERLCS 1 1 YPEENEVTEFFVTLEKDPQ 

QEDFLQGRMPGNPYS SNEPGIGPLMRD I KNKI CQDCDLVALLED 

DSGMELLVNNKI I SLDLPVAEVYKKVWCTTNEGEPMRI VYRMRG 

LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 

RLAGIRDFKQGRHLLTVliLKLFSYCV'KVKWRQQLVKLEMNTLN 

VMLGTLNLALVAEQESKDSGGAAVAEQVLS IMEI \ IQAEPNVEP 

LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLQGLLRIIP 

YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 

IAAGIK\NNSNGHQIi\KDL\ILQKGITQNAIiD\YMKKHIP/SAA 

RI WDAD I \ WKS FCLRPALPF ILRLLRGLAIQHPGTQVLIGTDS I 

PNIiHKLEQVS\SDEGIGTIiA\ENL\LESLREHPDVNKKIDA\AR 

RETRAEKKRMAMAMRQKALGTLG \MTTNE KGQ WD / TRTALLEA 

DWEELIEEP\GLTCCICREGYKFQPTKVLGIYTFTKRWLGGVW 

ENKPRETSRATSTVSHFNIVHYDC\HliA\AVSLARGREEWESAA 

LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 

TYQLNIHDIKLLFLRFAMEQSFSADTGGGGRESNIHLIPYIIHT 

GLYVIiNTTRATSREEKNLQGFLEQPKEKWVESAFEVDGPYYFTV 

LALHILPPEQWRATRVEILRRLLVTSQARAVAPGGATRLTDKAV 

KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTEGGWSCSLAEYIR 

HNDMPIYEAADKAIiKTFQEEFMPVETFSEFLDVAGIiLSEITDPE 

SFLKDLLNSVP 


5789 


1 


2407


LPLHAVEKTGRPGQPALKMPGKLRSDAGLESDTAMKKGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEE I DAPKPKKMKKEKEMNGETREKS PKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFPI QAKTFHKVYSGKDL I AQARTGTGKTFSFAI PL 
IEKLHG\ELQDRKRGRAPQVLVLAPTRELANQVSKDFSDITKKL 
S VACFYGGTPYGGQFERMRNG I DILVGTPGRI KDHI QNGKLDLT 
KLNHWLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHWVFNVAKKYMKSTYEQVDLIGKKTQKTAI TVEHLAI KCH 
WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD 
AQSLHGD I PQKQRE ITLKGFRNGSFGVLVATNvAARGLDIPEVD 
LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYQHKEEYQLVQ 
VEQKAG I KFKRIGVP SATE 1 1 KASSKDAI RLLDSVPPTAI SHFK 
QSAEKLIEEKGAVEAIAAALAHISGATSVDQRSLINSNVGFVTM 
ILQCS I EMPNISYAWKELKEQLGEEIDSKVKGMVFLKGKLGVCF 
DVPTAS VTE IQEKWHDSRRWQLS VATEQPELEGPREGYGG FRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 


5790 


3786 


1585


ARRQRDPLQALRRRNQELKQQVDSLLSESQLKEALEPNKRQHIY 
QRCIQLKQAI DENKNALQKLS KADESAP VANYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEEKEENES HKW STGEE Y I AVGD FTAQQVGDLTF KKGE I 
LL V I EKKPDGWW I AKDAKGNEGL VP RTYLEP YS EEEEGQE S SEE 
GS EEDVE AVDETADGAE VK\QRTDP HWS AVQKAI SEAGI FCLVN 
HVSFCYLI VLMRNRMETVEDTNGSETGFRAWNVQSRGRI FLVSK 








PVLQQ INTVDVLTTMGAI PAGFRPSTLSQLLEEGNQFRANYFLQ 
PELMPSQLAFRDLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 
GMS IQVLSRHVRLCLFDGNKVLSNIHTVRATWQPKKPKTWTFS P 
QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGERG 
ELSCGWVFLKLFDASGVPI PAKT YELFtiNGGTPYEKG I EVDPS I ! 
SRRAHGSVFYQIMTMRRQPQLLVKLRSIiNRRSRNVLSLLPETIjI 
GNMCSIHLLI FYRQILGDVLLKDRMSLQSTDLISHPMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKEFLKVPRFLLVYH 
\GCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLSPDGVHEPFDLSEQTTOFLGEMRKNAV I 


5791 


3 


163^ 


LRVAE FAGTS R/ IGAGLIQPLHRAPARDHGLLRGGAAPALSVSH 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 
NVKLKAQTYELQESNVQLKLTIVNTVGFGDQINKEESYQPIVDY 
I DAQ FEAYLQEELKI KRS LFT YHDSRIHVCLYFI S PTGHS LKTL 
DLLTMKNLDSKVYI I P VI AKADTVS KTE LQKFKI KLMS ELVSNG 
VQIYQFPTDDDTXAKVNAAMNGQLPFAVVGSMDEVKVGNKMVKA 
RQYP WGWQVENENHCDF VKLREML I CTNMEDLREQTHTRHYEL 1 
YRRCKLEEMGFTDVGPENKPVSVQETYEAKRHEFHGERQRKEEB 
MKQM FVQRVKEKEAI LKEAERELQAKFEHLKRLHQEERMKIiEE K 
RRLLEEEIIAFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 
FVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITSIFGEQPQ | 
LLI FMEKYFQVQGQYISQSE 


5792 


2263 


653 



AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD


5793 



2263 


653 


AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY
DTQAMKYLSYLLYPLCVGGAVYSLLNTKYKSWYSWLTNSFVNGV
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD


5794 


1


5016 


MG PRLS VWLLLL PAALLLH EEHS RAAAKGG CAG S GCGKCDCHGV 
KGQKGERGLPGLQGVIGFPGMQGPEGPQGPPGQKGDTGEPGLPG 
TKGTRGPPGASGYPGNPGLPGIPGQDGPPGPPGIPGCNGTKGER 
GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQM 
GLS FQGPKGDKGDQGVSGP PGVPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD j 
RGFPGTSLPG PSGRDGLPGPPGS PGPPGQPG YTNG I VECQPGP P 
GDQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPGPQG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK | 








GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGFPGGPGSPGLPGPKGEPGKIVPIiPGPPGAEGLPGSPGF 
PGPQGDRGFPGTPGR\PGIi\PGEKGAVG\QPGIGFPGPPGPKGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGL 
PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 
LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 
PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 
AGPPGIGIPGLRGEKGDQGIAGFPGS PGEKGEKGS IGIPGMPGS 
PGLKGS PGSVGYPGSPGLPGEKGDKGLPGLDG I PGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGLAGSPGI PGSKGEQGFMGP PG PQGQPGLPGS P 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 
QGPKGLPGLQGIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 
VTRHSQT IDDPQCPSGTKI LYHGYSLLYVQGNERAHGQDLGTAG 
SCIjRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 
ITGE3^IRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 
GYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCN 
YYANAYSFWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 
RT 


5795 


1192 


61 


STRS PT VE Y I SAHPH I LFMLLKG YEAPQ I ALRCG I MLRECI RHE 
PIiAKI ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTI FEDYEKLLQS ENYVTKRQSLKLLGELI LDRHN 
FAIMTKY I SKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVAS PH 
KTQP I VE I LLKNQPKLI EFLS S FQKERTDDEQFADEKNYLI KQ I 
RDLKKTAP * RALRDS KR 


5796 



2 



1078 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKF 
FGEIGLLDPGMDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKP YNSNIGFYTKRNALRVAEVWMDDYKSHVY IAWNLPLENP 
G I D I GDVS ERRALRKS LKCKNFQ WYIiDHVYPEMRR YNNTVAYGE 
LRNNKAKDVCIiDQGPLENHTAILYPCHGWG PQLARYTKEGFLHL 
GAI^TTTLLPDTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQ 
NGAIMNKGTGRCLEVENRGLAG IDLiI LRS CTGQRWTI KNS I K*R 
EGAGAI^PGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 


5797 


2 


891 


PRVRQKTLVDVTLENSN I KDQ I RNLQQT YEASMDKLREKQRQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 
KHSAEKEALIiEETNSFLKAIEEANKKMQAAEISLEEKDQRIGEL 
DRJjI ERME KERHQLQLQLLEHETEMSG EJjTDS DKLk x QU-Lia HAo 
ASLRERIRHLNDMVHCQQKKVKQMVEEIESLKKKLQQKQLLILQ 
LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP 
SQTGRTRE IVMPS RNYTPYTRVLELTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNQEKQPYYEEQARLSKIHLEKYPNYKYKPRPKR 
TCIVDGKKLRIGEYKQLMRSRRQEMRQFFTVGQQPQIPITTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGS LAGNEM INGEDEMEM YDDYEDDP KSDYSSENEAPEAVSAN 


5799 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN
KTSVQFQNFSPTWHPGDLQTQIAVQTKRVAAQVDGGAQVQQVL
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA








LLDNVDPNPENFVGAGIIQTKALQVGCLLRLEPNAQAQMYRLTL
RTSKEPVSRHLCELLAQQF 


5800 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN
KTSVQFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA
LLDNVDPNPENFVGAGIIQTKALQVGCLLRLEPNAQAQMYRLTL
RTSKEPVSRHLCELLAQQF


5801 


3 


1413 


FPRLYHLIPDGEITS I KINRVDPSESLS IRLVGGSETPLVHI 1 1 
QHI YRDGVI ARDGRLLPGD 1 1 LKVNGMD I SNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDS FHVI LNKSS PEE 
QLGI KLVRKVDEPGVF I FNVLDGGVAYRHGQLEENDRVLAINGH 
DLRYGS PESAAHL I QASERRVHLWSRQVRQRSPDI FQEAGWNS 
NGSWSPGPGERSNTPKPLHPTITOIEKVVNIQKDPGESLGMTVA 
GGASHRE WDLP I YVI S VE PGG V I SRDGR I KTGD I LLNVDGVELT 
EVSRSEAVALLiKRTSSS I VLKALEVKEYEPQEDCSS PAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCGDrLLAVNGRSTSG 
M IHACLARLLKELKGR I TLT I VS WPGTFL 


5802 


3 


290 


CFSLYQIhffiRIMDLPTLLRHAFREMFSVGGLFWMFRIRIILCLM 
GAFFYLISPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 
ITQRLTR 


5803 


2234 



1299 


EAOFGTTAE I YAYREEODFGIE I VKVKAIGRORFKVLBLRTOSD 
GIQQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
S YKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRI KKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQR 
LRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKKDMSPQKFWGLTRSALLPTIPDTEDEISPDKVI 
LCL


5804 


2 



1707 


EME KQRQEEQRKRTEEERKRR I EQDMLEKRKI QRELAKRAEQ I E 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
REEKER IKYEE DKR I RYEEQRPS LKEAKCLS LVMDDE I ES EAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDEENQDTAKI FKGYRPGKLKLSFEEMERQRRBDEKR 
KAEEEARRRI E EEKKAFAEARRNM WDDDSPEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKpEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQFEQREI 
DAALQKKREEEEEEEGS IMNGSTAEDEEQTRSGAPWFKKPLKNT 
SWDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 



776 


Y I S DTLGQVYKS KIRWW 1 E ENGGNGN I S VDDL I ALLDLAEHAS S 
AFKESQQQSBDREYEVKERLYPKSKRRYDTYN IAG YQGE I EVGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYKLA 
LKNYIPYLTKLKFSLKKSFDFFDEYFVLLKPRNNIKQNEEAKTR 
RKVAG YFKKYVD X FCLLEESQNNTGLGS KFS E PLQ VERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANL YS LHS W LG I TT VFL FACQRFLG FAVFLLP W 
ASMWLRS LLKP I HVFFGAAI LSLS I AS VISG INEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRP 


5807 


2267 


1302


RFS KKTFRRP MAVD IQ P AC LGL YCG KTLLFKNG S TE I YGE CG VC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFIEW 
YSG KKS S SAL FQH I TAL FE CSMAAI I TLLVS D P VG VL Y IRS CRV 
LMLSDWYTML YNPS PDYVTTVHCTHEAVYPL YTI VF I YYAFCLV 








GLLYYAFPYI ILVLSLVTLAVYMSASEIENCYDLLVRKKRLIVL 
FSHWLIJIAYGI IS ISRVDKLEQDLPLLALVPTPALFYLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


S LPDSGWE YLSNGGVADNHKD FGE LR YNE CLMNFS CNGKNG S S 
EGRITHGFQLKSAYENNLMPYTNYTFDFKGVIDYI FYS KTHMNV 
LGVLGPLDPQWLVENWITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHL PNRR 


5809


464 


2422 


I LVPGFQG I LHPGVYCALQSQHQAQELVAD IDECEVSGLCRHGG 
RCVNTHGS FE CYCMDGYLPRNGPEPFHPTTDATSCTE I DCGTPP 
EVPDGYIIGNYTSSLGSQVRYACREGFFSVPEDTVSSCTGLGTW 
ESPKLHCQE INCGNPPEMRHAILVGNHS SRLGGVARYVCQEGFE 
S PGGKITS VCTEKGTWRESTLTCTE ILTKINDVSLFNDTCVRWQ 
INSRRINPKI S YVIS I KGQRLDPMES VREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 

n t* nirrvn/*iT tr T \TnnpnwiTnnTTftTMVAcwfltn' r*T*VTT UMPOtfUTODM 

S IFNETCLKLNRRSRKVG5EHMYQFTVLGQRW YLiANi 1 £»HAT5FN 
FTTREQVPWCLDLYP TTDYTVWVTLLRS P KRHS VQ IT I ATPP A 
VKQTISNI SGFNETCLRWRS I KTADMEEMYLFHIWGQRWYQKEF 
AQE MTFN I S S S S RD P EVCLDLRPGTNYNVS LRALS S EL P WI S L 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LPLALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 
E I P IGDRL Y YGEYYNAPLKRGSDYCI ILRITSEWNKVRRHSCAV 
WAQVKDS SLMLLQMAGVGLGS LAWI ILTFLSFSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAI SAVKVQIAEFLENLQEKSLRIEAFVS 
EIES FFNTI EENCSKNEKRLEEQNESMMKKVliAQYDEKAQSFEE 
VKKKKME FLHEQMVHFLQSMDTAKDTLETI VREAEELDEAVFLT 
SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTT IAVYWSMNKEDVIDS FQVYCME 

mm* mm*. A mm* mm*, mm* mm* mt ■mm* m~ m mm mm | ■ ▼ fill TTTT^ i"l If "Y" T"1^^^^T ^^^^^^Itt ^l^^^VT TYaYT YfkM TV T TXT 

E PQDDQEVNELVEEYRLTVKES YCI FEDLEPDRCYQVWVMAvNF 
TGCSLPSERAI FRTAPSTPVIRAEDCTVCWNTAT I RWRPTTPEA 
TETYTLE YCRQHSPEGEGLRS FSG I KGLQLKVNLQPNDNYFFYV 
RAIKAFGTSEQSEAALISTRGTRFLLLRETAHPALHISSSGTVI 
SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGETSWYMHCS EPQRYTFFYSGI VSDVHVTERPAR 
VG1LLDYNNQRLI FINAESEQLLFI IRHRFNEGVHPAFALEKPG 
KCTLHLGIBPPDSVRHK 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WD I EGAVRRYV Q P FLNALGAAGNFS V DS Q I LYYAMLG VNPRFDS 
ASSSYYLDMHSLPHVINPVESRLGSSAASLYPVLNFLLYVPELA 
HS PLYI QDKDGAPVATNAFHSPRWGG IMVYNVDS KTYNAS VLPV 
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE 
VYKAVAAVQKSAEEIASGHLASAFVASQEAVTSSELAFFDPSLL 
HLL YFPDDQKFAI Y I P L F L PMAVP I LLS LVK I FLETRKS WRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREEEVEPGTARPPPAASAMDASLEKIADPT 
LAEMGKNLKEAVKMIiEDSQRRTEEENGKKLISGDIPGPLQGSGQ 
DMVSHjQbVQNIiMHGDEDEEPQSPRIQNIGEQGHMALLGHSLGA 
YI STIiDKEKLRKLTTR I LSDTTLWLCRI FRYENGCAYFHEEERE 
GLAKI CRLAIHSRYEDFWDGFNVLYNKKPVI YLSAAARPGLGQ 
YLCNQLGL.PFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RL PLLLVANAGTAAVGHTDKIGRLKELCEQYG I WLHVEGVNLAT 
I^GYVSSSVLA?UUCCDSMTMTPGPWLGLPAVPAVTIjYKHDDPA 

QRLQESLKKVNYI KI LVEDELS SPWVFRFFQELPGSDPVFKAV 
PVPNMTPSGVGRERHSCDALNRWLGEQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQIjR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVET I AATARE I EDNS RLLENMTEWRKGIQEAQVELQKAS 
EERLLEEGVLRQI PWGS VIiNWFS P VQALQKGRT FNLTAGS LES 








TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPFKRSLRGSDA 
LSETSSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGAD PGSRG 
LLRLLSFCVLIiAGLCRGNSVERKlYIPLNKTAPCVRLLNATHQI 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 
SNS YGPE PAHCRE IQWNSLGNGLAYEDFS PPI FLXEDENfiTKVI 
KQCYQDHNLS QNGSAPTFPLCAMQLFSHMAWLS FSTAT\ CMRRS 
S IQSTFS INPKI VCDPLSDYNVWSMLKPINTTGTLKPDDRVWA 
ATRLDSRSFFWNV\APGAES AVAS FVTQLAAAEALQKAPDVTTL 
PRlWlFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEIi 
G QVALRTS L ELWMHTD P VS QKNE S VRNQVEDLLATLE KSGAGVP 
AVI LRRPNQS QPLPPS SLQRFLRARN I SGWLADHSGAFHNKYY 
QSIYDTAENINVSYPEWLEPLKE/ETWNFG*QDTAKALADVATV 
LGRALYELAGGTNFSDTVQADPQTVTRLLYG\ FLIKANNS WFQS 
ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 
IASKELELITLTVGFGILIFSLIVTYCINAKADVLFIAPREPGA 
VSY


5814 


8500 



432 


ALKCRPRRVLAILVGPVQPDRMAEEGAVAVCVRVRPLNSREESL 

GETAQ V YWKTHNNVI Y P VDG S KS FNFDRVLHGNE TPKNVYEA\ I 

AAP 1 1 DS AIQG YNGTI FA\ YGQT \ ASGKTYTMMGSEDHLGV I PQ 

GQFHGHFSQKI*EVFLDREFLLRVSYMEIYNETITDLLCGTQKM 

KPL 1 1 REDVNRNVYVADLTEEWYTSEMALKW I TKGEKS RH YGE 

TKMNQRSSRSHTIFRMILESREKGEPSNCEGSVKVSHLNLVDIjA 

GSERAAQTGAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI 

NYRDSKLTRILQNSLGGNPKTRIICTITPVSFDETLTALQFAST 

AKYMKNTP YVNEVSTDEALLKRYRKE IMDLKKQLEEVSLETRAQ 

AMEKDQLAQLLEEKDLLQKVQNEKI ENLTRMLVTSS SLTIiQQEL 

KAKRKRRVTWCLGKINKMKNSNYADQFN I PTNITTKTHKLS INL 

LREIDESVCSESDVFSNTLDTLSEIBWNPATKLLNQENIESELN 

SEjRADYDNLVLDYEQLRTEKEEMELKLKEKNDIiDEFEALERKTK 

KDQEMQLIHEISNLKNLVKHREVYNQDIiENELSSKVELLREKED 

QIKKLQEYIDSQJdiENIKMDLSYSLESIEDPKQMKQTTjFDAETV 

ALDAKRESAFLRSENLELKEKMKELATTYKQMENDIQLYQSQIjE ! 

AKKKMQVDLEKELQSAFNEITKItTSLIDGKVPKDLLCNLELEGK 

ITDLQKELNKEVEENEALREEVILLSELKSLPSEVERLRKEIQD 

KSEELHIITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 

SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQE I VNLS KE 

AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 

NRDSPLQTVEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 

QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 

KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 

TADVKDNE 1 1 EQQRKI FSL IQEKNELQQMLES VI AEKEQLKTDL 

KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 

TCDRLAEVEEKLKEKSQQLQEKQQQLLNVQEEMSBMQKKINEIE 

NLKtreLKNKELTLEHMETERLELAQKLNENYEEVKS ITKERKVL 

KELQKSPETERDHLRGYIREIEATGLQTKEELKIAHIHLKEHQE 

TIDELRRSVSEKTAQI INTQDLEKSHTKLQEEIPVLHEEQELLP 1 

NVKKVSETQETMNELELLTEQSTTKDSTTLARIEMERLRLNEKF 

QESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 

SQS KQEQS LNMKEKDNETTKI VS EME QFKPKDSALLR I E I EMLG 

LSKRIiQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKENlKEIV 

AKHLETE E ELKVAH CCLKEQEET INELRVNLS E KETE I S T IQKQ 

LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 

KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKEEMKRVQEA 

LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 

IEHLKEQFETQKLNLENIETENIRLTQILHENLEEMRSVTKERD

DLRSVEETLKVERDQLKENLRETITRDLEKQEELKIVHMHLKEH 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QETIDKLRGIVSEKTNEISNMQKDLEHSNDALKAQDLKIQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQIKDQSLTLSK 
LEIENLNIAQKLHENLEEMKSVMKERDNLRRVEETLKLERDQLK
ESLQETKARDLEIQQELKTARMLSKEHKETVDKLREKISEKTIQ 
ISDIQKDLDKSKDELQKKIQELQKKE^QLLRVKEDVNMSHKKIN 
EMEQLKKQFEPNYLCKCEMDNFQLTKKLHESLEEIRIVAKERDE
LRRIKESLKMERDQFIATLREMIARDRQNHQVKPEKRLLSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH
RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVEKQKELLIK 
IQHLQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFPSIK 
TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KNYQTLKTSLASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELE 

LEKTKETIQVLQDKVALGAKPYKEEIEDLKMKLGKIDLEKMKNA
KEFEKEISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
EQLIKQKNELLSNNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS
LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 

DVP\ECKTQ 


5815


23 



1460 


SELVMWTVQNRESLGLLSFPVMITMVCCAHSTNEPSNMSYVKET

VDRLLKGYDIRLRPDFGGPPVDVGMRIDVASIDMVSEWMDYTL

TMYFQQSWKDKRLSYSGIPLNLTLDNRVADQLWVPDTYFLNDKK

SFVHGVTVKNRMXRljHPIXiTvk 1 iAAUJyiMLUjXKXi'JuUc 

QNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIELPQFSIVDY

KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL

SWVSFWINYDASAARVALGITTVLTMTTISTHLRETLPKIPYVK

AIDIYLMGCFVFVFLALLEYAFVNYIFFGKGPQKKGASKQDQSA

NEKNKLEMNKVQVDAHGNILLSTLEIRNETSGSEVLTSVSDPKA 

TMvcvneiicTrtVDtnjT cgdp\ ADncwfT\rDQWSPTPPP2lQ\ 
TM Y5 X JJbAo lyi KlUr Li o o Kr» \i\ uKAr Urcnva v ko iujk ± kkjo-vo \ 

QLKVKIPDLTDVNSIDKWSRMFFPITFSLFNWYWLYYVH


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRL
NPHRSLLGTGNYDVNVIMAALQGLGLAAVWWDRRRPLSQLALPQ
VLGLILNLPSPVSLGI^SLPLRRPJHLRWPCARIi/VTVSYYNLDS 

SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 

VMSNTTVPNAPQANSDSMVGYVLGPFFLITLVGVVVAVVMYVQK 

XKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGD?KW\QAG 

RVATSTSGCHCWMSRRDLTPLPHPSEPGVI1DCLGPCHLI1PLLSP 

GSPCWVLGXjHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA j 

HPQALMGRGFPSGMAAAGRHLCFL 


5818 


3 


3318 


QALRDKLWIFLVQSFYAVRHTESWKLMSTDDQQKIQAAAFDKGD 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSWRGNNKKECWSYLST 
NKKMKSDGLGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI
TKELKTGGKNVSGKPKTVTKSKTENGDKARLENMSPRQWERSA
TAAAAATGQKNLLNGKGVRNQEGQISGARPKVLTGNLNVQAKAK 
PLKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS
TEEEKPSGHKLSFCDSPGQMMKNSVDSVKNSTVAIKSRPVSRVT
NGTSNKKSIHEQDTNVNNSVLKKVSGKGCSEPVPQAILKKRGTS
wnrTAAOORTKSTPSNLTKTOGSOGES PNSVKSSVS SRQSDENV 
AKLDHNTTTEKQAPKRKMVKQVHTALPKVNAKIVAMPKNLNQSK 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNNNKDSVSEQKPHKPLINLASEISDAEALQSSCRP\DPQK 
PLNDQEKEKLALECQNISKLDKSLKHELESKQICLDKSETKFPN 
HKETDDCDAANICCHSVGSDNVNSKFYSTTALKYMVSNPNENSL 
NSNPVCDLDSTSAGQIHLISDRENQVGRKDTNKQSSIKCVEDVS
LCNPERTNGTLNSAQEDKKSKVPVEGLTIPSKLSDESAMDEDKH








ATADSDVSSKCFSGQLSEKNSPKNMETSESPESHETPETPFVGH 
WNLSTGVLHQRESPESDTGSATTSSDDIKPRSEDYDAGGSQDDD 
GSNDRG I S KCGTMLCHDFLGRS S S DTSTPEE tiKI YDSNLRIE VK 
MKKQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR
GSVQFAQEIDQVSSSADETEDERSEAENVAENFSISNPAPQQFQ
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE
CKQNKGNSVCKNESTVLDLSSIDSSRKNKQSVSATEKKNTIDVL
SSRSRQLLREDKKVNWGSNVENDIQQRSKFLDSDVKSQERPCHL
DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
AN IALSAGD I DDCDTLAQTRM YDHR ? SKTLSPI YEMDVI EAFEQ 
KVESETHVTDMDF*DDQHFAKQDWTLLKQLLSEQDSNLDVTOSV 
PEDLSLAQYLINQTLLLARDSSKPQGITHIDTLNRWSELTSPLD
SSASITMASFSSEDCSPQGEWTILELETQH


5819 


1 


5557

* 


AAAGLLGALHLVMTLVVAAARAEKEAFVQSESIIEVLRFDDGGL

LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHEQPVGMP

KMEKVYLHNPSSE*TITLVSIFATTSHFHASFFQNRKILPGGNT 

S FDVS / VFLARVVGNVENTLFINTSNHGVFTY \QVFGVGVPNP Y ' 

RLRPFLGARVTVNSS FS P I INIHNPHSEPLQWEMYSSGGDLHL 

ELPTGQQGGTRKLWEIPPYCTKGVMRASFSSREADNHTAFIRIK 

TNASDSTEFI ILPVEVEVTTAPGI YSSTEMLDFGTLRTQDLPKV 

LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITLKAS\ESK 

YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKLEIPYQAEV 

LDGYLGFDHAATLFHIRDSPADPVERPIYLTNTFSFAILIHDVL 

LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN

NILLITNASKFHLPVRVYTGFLDYFVLPPKIEERFIDFGVLSAT

EASNI LFAI INSNP I ELAI KS WH 1 1 GDG \ LS I ELVAVDRGNRTT 

IISSIiPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 

DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHVVLPPSFPGKI 

VHQSLNIMNSFSQKVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 

KKSKIANIYFDPGLQCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 

WDADWDLHQSLFKGWTGI KENSGHRLSAI FEVNTDLQKNI ISKI 

TAELSWPSILSSPRHLKFPLTNTNCSS\EEEITLENP/SQDVPV 

YVQFI PLALYSNPSVFVDKLVSRFNLSKVAKIDLRTLEFQVFRN 

SAHPLQS STGFMEG \LS PHLI LNL I LKPGEKKS VKVK\FTPVHN 

RTVSSLIIVRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR

FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 

rSGYSCEGYGFKWNCQEFTLSANASRDIIILFTPDFTASRVIR 

ELKFITTSGSEFVFILNASLPYHMLATCAEALPRPNWELALYI I 

ISGIMSALFLLVIGTA\ YLEAQGI WEP \ FRRRLS \ FEASNPPFD 

VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 

GSHKQ * GPSGHPHS SHSNRNSADVDDVRAYNSGRTS SMTSAQAA 

SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 

PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 

KPLQRKVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 

TSNPDTEPLLKEDTEKQKGKQAMPEKHESEMSQVKQKSKKLLNI 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK

SRNAQKTKGTSKLVDNRPPAIAKFLPNSQELGNTSSSEGEKDSP

PPEWDSVPVHKPGSSTDSLYKLSLQTLNADIFLKQRQTSPTPAS

PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 

SLPGKNGNPTFAAVTAGYDKSPGGNGFAKVSSNKTGFSSSLGIS

HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS

TLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR
RSSDPWSNSHFPHEN 


5820 


310 


1270


RVS LSGP VS LGVLLCARSSTMGKRDNRVA YMNP I AMARS RGP I Q 
SSGPTIQ\ VI *IDQGLPGKK* KSN* KRKRK/DSKALAEFEEKMN 
ENWKKELEKHREKLLSGSESSS KKRQRKKKEKKKS W* \DS SSS \ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 








DSKDSLKKKKKSKDGTEKEKDIKGLSKKRKMYSEDKPLSSESLS 
ES E Y I E EVRAKKKKS SEEREKATEKTKKKKKHKKHSKKKKKKAA 
SS SPDS P* H * EKSGFPYKESAMS EE I STVKTTTYLLKCMNFLVF 
GI I PGLFSSHSDATV 


5821


179 


915 


KWRNQSWRWPKPGTNWMIjSCSVCWRRVTWTGSVWMRKLGKHPQT 
PT/I KDCS I AATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
T YVI KLFDRS VDLAQFS ENTPL YP I CRAWMRNS PS WERE CSPS 
SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALRMQGTP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKI LREMYERQ 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMWTGGCRRI 

PVLVFHADAILTKDNNIRVIGERYHLSYKIVRTDSRLVRS I LTA 

HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 

LTRKDRLYKNIIRMQHTHGFKAFHILPQTFLLPAEYAEFCNSYS 

KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 

LLIDDFKFDVRLYVLVTSYDPLVIYLYEEGIiARFATVRYDQGAK 

NIRNQFMHLTNYSVNKKSGDYVSCDDPEVEDYGNKWSMSAMLRY 

LKQEGRDTTALMAHVEDLIIKTIISAEIiAIATACKTFVPHRSSC 

FELYGFDVLIDSTLKPWLLEVNLSPSLACDAPLDLKIKASMISD 

MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 

SDAEMKNLVGSAREKGPGKLGGSVLGLSMEE I KVLRRVKEENDR 

RGGFIRIFPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 

APELKI * S LNS KAKLHAALYERKLLSLEVRKRRRRS SRLRAMRP 

KYPVI TQPAEMNVKTETESE EEEE VALDNEDEEQEASQEESAGF 

LRENQAKYTPSLTALVENTP KENSMKVREWNNKGGHCCKLETQE 

LEPKFNLMQILQDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 

QTFSASWAAKEDEQMELVVRFLKRASNNLQHSLRMVLPSRRLAL 

LERTRILAHQLGDF 1 1 VYNKETEQMAEKKSKKKVEEEEEDGVNM 

ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 

NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 

EKEAKLVYSNSSSGPTATLQKIPNTHLSSVTTSDIjSPGPCHHSS 

LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGLP 

RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYIjNKHHS 

G I AKTQ KEGEDAS L YS KR YNQS MVTAELQRLAE KQAARQ YS P S S 

HINLLTQQVTNLNLATGI INRS S ASAP PTLRP 1 1 S P SGPTWSTQ 

SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 

KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL
VPKPPPNHEQVLRRATSQKASKGSSABGQLNGLQSSLNPAAFVP
ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP
TSESPFAWSPLAGEKFVEVYKEAHLLALHIESSSRNQAAQAAKP
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPKK
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA
CTPQPVAKAKSSEFASIPAN*LPGLCPNISKS\GRMGPAMLRPA
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS
EALLVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEKFVEVYKEAHLLALHXESSSRNQAAQAAKP
4210



EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPKK
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA
CTPQPVAKAKSSEFASIPAN*LPGLCPNISKS\GRMGPAMLRPA
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS
EALLVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE
SRPLXDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 



194 



871



2287 



FLQIESASPAPFSSGFLAAHPHSPGGSLATKGRSRLSAPGMLHL 

SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEL 

SGAERERPRHFREFTVCS IGTANAVAGAVKYSESAGGFYYVESG 

KLFS VTRNRF I HWKTSGDTLE LME E S LD I NLLNNAI RL KFQNCS 

VLPGGVYVSETQNRVI I LMLTNQTVHRLLLPHPSRMYRS ELWD 

SQMQSIFTDIGKVDFTDPCNYQLIPAVPGISPNSTASTAWLSSD 

GEALFALPCASGGIFVLKLPPYDIPGMVSWELKQSSVMQRLLT 

GWM PTAIRGDQS PSDRPLSLAVHCVEHDAFI FALCQDHKLRMWS 

YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 

GIF\MHAPKRGQFCIFQLVSTESNRYSLDHISSLFTSQETLIDF 

ALTSTD I WALWHDAENQTVVKYI NFEHNVAGQWNPVFMQPLPEE 

EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 

DliSWSELKKEVTIAVENELQGSVTEYEFSQEEFRNLQQEFWCKF 

YACCLQYQEALSHPLALHLNPHTNWCIjLKKGYIiSFLI PSSLVD 

HLYLLP YENLLTEDETT ISDDVD IARDVICLI KCLRLI EES VTV 

DMSVIMEMSCYNLQSPEKAAEQILEDMITIDVENVMEDICSKLQ 

EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 

GSNTAGYI VCRGVHKI ASTRFLI CRDLLILQQLLMRLGDAVT WG 

TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 

LQHLSVLELTDSGALMANRFVSSPQTIVELFFQEVARKHI ISHL 

FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 

GNCQYVOLQDYIQLLHPWCQVOTGSCRFMLGRCYLVTGEGQKAL 

ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRLQYYDKVLRLLD 

VIGIiPELVIQLATSAITEASDDW\KSQATL\RTCIFKHHL\DLG 

\HNSQAYGSL*PQIPDSSRQLDCLRQLVWLCERSQLQDLVEFS 

YVNLHNE WGI IESRARAVDLMTHNYYELLYAFHI YRHNYRKAG 

TVMFEYGMRLGREVRTLRGLEKQGNCYIiAALNCLRLIRPEYAWI 

VQP VSGAVYDRPGAS PKRNHDGECTAAPTNRQI E ILELEDLEKE 

CSLARIRLTLAQHDPSAVAVAGSSSAEEMVTLLVQAGLFDTAIS 

LCQTFKLPLTPVFEGLAFKCIKIjQFGGEAAQAEAWAWLAANQLS 

SVITTKESSATDEAWRLLSTYLERYKVQNNLYHHCVINKLLSHG 

VPLPIWLINSYKKVDAAELLRLYLNYDLI*DLTPYQVIRICGC 



KSQLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKQKNR 
AAAQRSRQKHTDKADALHQQHESLEKDNLALRKEIQSLQAELAW
WSRTLHVHERLCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AWAEP P VQLS PS PLLFASHTGSSLQGS S S KLS ALQPS LTAQTA 
PPQPLELEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPSPHPLLAFPLLSSAQVHF 



GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP* *HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLEVALETI^SAEVCAGIYDILLALIFLHDRGHLTHNNVCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQS IRDPAS IPP 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
S FQQTLHSTLLNP I PKWRPALCTLLSHDFFRNDFLE WNFLKS L 
TLKS EEE KTE F FKFLLDRVS CLS E EL I ASRL VPLLLNQLVFAE P 



VAV\KSFLPYLLGPKKDHAQGETPCLLSPALFQSRVIPVLLQLF 
EVHEEHVRMVLLSHIEAYVGALSLREQLKKV\IL\PQVLLG\LR 
D\TS DS IVAI TLHSLAVLVSLLGPEWVGGERTKI FKRTAP\ S F 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNl\QIWP\REP\CDDVKSQCTTLDV 
BE SS WDD CE PS S LDTKVNPGGG I TATKP VTSGEQKP I PALLS LT 
EESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSEL 
GLGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
EMVP KKDDVS P VMQFS S KFAAAE ITEGEAEG WEEEG ELNWEDNN 
W 



5828 



257 



5829 



260 



1259 



AREGGSLGAVAACGELSYSCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDLTSS I PKPLLPVGNKPLI WYPLNLLERVGFEEV 
I WTTRDVQKALCAE FKMKMKPDI VCI PDDADMGTADSLRYI YP 
KLKTDVLVLS CDLITDVALHEWDLFRAYDASLAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GSILQKHPRIRFHTGLVDAHLYCLKKYIVDFLMENG\SITSIRS 
EL\IPYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
S FY * KEANYTGTGAP Y\D \ ACW I 



PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNC IS FHPSGN 
YLITA3SDGTLKILDLLKGRLIYTLQGHTGPVFTVSFSKGGEIiF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDS PPHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSLLPLLWISF 
LLILPQQQKPWGLCQTRVKRPVDIS *TLP * CHQNVCQQPRKRK 
QKT*VTSPVKVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 



5830 



4496 



3139 



GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
N IEAAVQDRLNEQEGVPSVFNPP P SRPLQYNTADHRI YSYWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TD P VGD I VS FMHS FEE KYGRAHPVFYQGTYSQALNDAKRE LRFL 
LVYLHGDDHQDSDEFCRNTIiCAPEVISLINTRMLFWACSTNKPE 
GYRVS QALRENTYPFLAM IMLKDRRE * PV\VGRLEGLI \QPDDL 
INQLTFIMDANQTYLVSERLEREERNQTQVLRQQQDEAYLASLR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLSHTEVLFVQDLTDE 



5831 



71 



2897 



5832 



2454 



329 



FCS KDKCCLYLPDS INRS KS CTAKPGAHSQDRHAVMDSERQVKD 
TDDI ESP KRS I RDSGYIDCWDSERSDS LSPPRHGRDDSFDSLDS 
FGSRS RQTPS PDWLRGS SDGRGSDS ESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWS TATSPAGLGKKALQDYGPRT \ PVS \DDAESTSMFDMRCEE 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLI KKEEERKKMEKLLAGEDGTSERRKS IKTYRE I VQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSS FLNDPNPMKYLRQQSLPPPKFTATVETT IARAS 
VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSQMFEGVARVH 
GS PLELKQDNGS I E INI KKPNSVPQELAATTEKTEPNSQEDKND 
GGKS RKGNIELAS SEPQHFTTTVTRCS PTVAFVEFP SSPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YEEEP+ 1 I\EDPWPFTVSSSSADQLSTSSSMTEGSGTMNKIDL 
GNCQDE KQDRRWKKS FQGDD SDLLLKTRE SDRLE EKGS LTEGAL 
AHSGNPVSKGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSEDVKPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMI IETLNLYFHIQCFRCG\ ICKGQLGDAVSGTDVRIR 

NGLL NCNDCYMRSRSAGQPTTL 

PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
S ENLE KLE KLGMS S DLVSRLPT I YRNAHP I KNKS S APSRVP PLF 








VPQGTSERKDSSGSVSPNTLSQEEGDQICLYHIRKSCSFQDKCH

RVHFHLPYRWQFLDRGKWEDLDNMELIEEAYCNPKIERILCSES 

ASTFHSHCLNFNAMT YGATQARRLSTAS SVTKP PHF I LTTDWI W 

YWSDEFGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV+R 

PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN*KGLPQTQIR\AP 

QDVTTMQTCNTKFPGPKSIPDYWDSSALPDPGFQKITLSSSSEE 

YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 

GKAVDERQLFHGTSAI FVDAICQQNFDWRVCGVHGTS YGKGS YF 

ARDAAYSHH YS KS DTQTHTMFIiAR VLVGE FVRGNAS FVRP PAKE 

GWSNAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIQYTTSSKPSV 

TPS ILLALGSLFSSRQ 


5833 


170 


3289 


SILCLLSPCWQFGKPWSILSSRSRHSPCTKKGWEGMRKHLHT
RQGHK* VHVE I S KALWVYRDDYFI RHS ISVSAVIVRAWITHKYR 
GRDWNVKWEENLLHAVAKNYTLLQTIPPFERPFKDHQVCLEWNM 
GYIWNLRANRIPQCPLENDWALLGFPYASSGENTGIVKKFPRF 
RITOELEATRRQRMDYPVFTVSLWLYIiLHYCKANLCGILYFVDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
D IS FNGGQ I WTTS IGQDLKS YHNQTI SFREDFH YNDTAGYFI I 
GGSRYVAG I EG FFGPLKYYRLRSLHPAQI FNPLIiEKQLAEQ I KL 
YYERCAEVQE I VSVYASAAKHGGERQEACHLHNS YLDLQRRYGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLIiTVPRNQNESVSEIG 
GKIFEKAVKRLSSIDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLNVPRDQLQGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELS YAY YSNIATKTPLDQHTLQGDQAYVETI RLKDDEI L 
KVQTKEDGDVFMWLKHEATRGNAAAQQRLAQMLFWGQQGVAKNP 
EAAIEWYAKGALETEDPALI YDYAI VLFKGQGVKKNRRLALELM 
KKAASKGLHQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\MGN 
PDASYNLGVL.HLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTL 
WCS LY Y I TGNLET FPRD P EKAWWAKHVAEKNG YLGHVI RKGLN 
AYLEGSWHEALLYYVLAAETGIEVSQTNLAHICEERPDLARRYL 
GVNCVWRYYNFSVFQIDAPSFAYLKMGDLYYYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLALLI EEGTI I PHHI LDFLEI DSTLH 
SNNIS I LQELYERCWSHSNEESFS PCSLAWLYLHLRLLWGAILH 
SALIYFLGTFLLSILIAWTVQYFQSVSASDPPPRPSQASPDTAT 
STAS PAVTPAADAS DQDQ PTVTNNPE PRG 


5834 


17 


4020 


RFRRGGGRVF PGAF PAS PS D S LGQGNS QGP FRT P KP PRT / QE CG 
SAAPGP IPGQSSS * VPLRLEQIQQKADCPLSLELALKPRMAAQV 
TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTGIARYIEQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 
PQVKCNEQPNRVEI YEKTVEVLEPEVTKLMNFMYFQRNAIERFC 
GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 
KNDHSAYKRAAQFLRKMADPQSIQESQNLSMFLANHNKITQSLQ 
QQLEVISGYEELLADIVNLCVDYYENRMYLTPS EKHMLLKVMGF 
GLYLMDGSVSNIYKIiDAKKRINLSKIDKYFKQLQWPLFGDMQI 
ELARY I KTSAHYEENKS RWTCTSSGS S PQYNICEQM IQI REDHM 
RFISELARYSNSEWTGSGRQEAQKTDAEYRKLFDLALQGLQLL 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVIAMI KGLQVLMGRMES VFNHAIRHTVYAALQD FSQ 
VTLMEPLRQAI KKKKNVI QSVLQAJRKTVCDWETGHEPFND PAL 
RGEKDPKSG*DIKVPRRAVGPSSTQLYMVRTMLESLIADKSGSK 
KTLRS SLSGPT I LD I EKFHRES FF YTHL INFSETLQQCCDLSQL 
WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 
YSLDL YNDSAHYALTRFNKQFLYDE IEAEVNLCFDQFVYKLADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 
HVQLLGRS IDLNRLITQRVSAAMYKSLELAIGRFESEDLTS IVE 
LDGLLEINRMTHKLLSRYLTLDGFDAMFREANHNVSAPYGR ITL 
HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 
QYLHGSKALNLAYSS I YGS YRNFVGPPHFQVICRLLGYQGIAW 
MEELLKVVKSLLQGTILQYVKTLMEVMPKICRIjPRHEYGSPGIL 
E F FHHQLKD I VE YAE L KT VCFQNIiRE VGNAI LFCLL I EQS LS LE 
EVCDLLHAAPFQNILPRVHVKEGERLDAKMKRLESKYAPLHLVP 








LIERLGTPQQIAIAREGDLLTKERLCCGLSMFEVILTRIRSFLD 
DPlWRGPLPSNGVMHVDECVEraRLWSAMQFVYCIPVGTHEFTV 
EQCFGDGLHWAGCMI IVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
E 1 1 KNVPLKKMVER IRKFQ I IjNDE 1 1 T I LDKYLKSGDGEGTP VE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGNIR^QGSHQIDFQVLHDI^QKFPEVPEVWSRC^ 
DACCAVL S QE S TRY L YGEGD LNFSDDSG I SGLRNHMTSLNLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSS SGASNSAPHLGFHLGS KGTSSLSQQT 
PRFNP IMVTLAPN I QTGRNTPTSIxHIHGVPPPVLNS PQGNS I YI 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDEIMSRSQPIO/YISA 
NAATGDEQVMRNQPTLF I S TNSGASAASRNMSGQVSMGPAF IHH 

iTnnT/PD7\ Tr»MTao7\TPC!D'D1Af\7 r rAT3M r r\ TrVTW TTVQPWTCPPAVSP 

GWS PTFELTNLLraPDITYVETENIHHLTDPTLAHVDRISETRK 
LSMGSDDAAYTQDr*RISNSWLGMVAHAOTSSALGGQDGRII*A 
QEFETSWGNIWRIjRLYRRF*NYAGMVAHTCSPSYSVD*ALLVHQ 
KARMERLQRELBIQKKKLDKLKSEVNEMENNLTRRRLKRSNS IS 
QIPSLEEMQQLRSCNRQLQIDIDCLTKEIDLFQARGPHFNPSAI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 

4 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 

SDVNYQCLFSAHVLHLRGVLTTQPVEDERGIWFLWNGEIFSGIK 

VEAEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 

HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGIiANQWQE 

VPAS\DFSELILSLLSFPDALFraCILGNIFLGRILLKKMLIA* 

VKFQQTYQHLYQR*QMKPNCILKNLLFL*I*CCHKLHWRLIAVI 

FPMCHLQERYFKSFLLMYT*KEVIQQFIDVLSVAVKKRVLCLPR 

DENLTANfc VLiKTC-DKKAxN v AJL xjxt oLrtjlxJoPiv jjm^*wxxt jji^a 

P IDLLNVAF IAEEKTMPTTFNREGNKQKNKCEI PSEEFS KDVAA 

AAADS PNKHVSVPDR I TGRAGLKELQAVSPSRIWNFVE INVSME 

E LQKLRRTR 1 CHL 1 RPLDTVLDDS IGCAVWFASRG I GWLVAQEG 

VKSYQSNAKVVLTGIGADEQLAGYSRHRVRFQSHGLEGliNKEIM 

MELGRISSRNLGRDDRVIGDHGKEARFPFLDENWSFLNSLPIW 

EKANLTLPRGIGEKIiLLRIxAAVEIXLTASAIiIjPKRAMQFGSRIA 

I^EKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 


NGNAVAQAPVTNCCYIATGSKDQTIRIWSCSRGRGVMILKLPFL 
KRRGGGIDPTVKERLWLTLHWPSNQPTQLVSSCFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSLAIGVGDG 
M IRVWNTLS IK1WYDVKNFWQGVKSKVTALCWHPTKEGCLAFGT 
DDGKyGLYDTYSNKPPQISSTYHIOCTVYTl^WGPPVPPMSLGGE 
GDRPSLAL Y S CGGEG I VLQHNP WKLSGEAFD INKLI RDTNS I KY 
KLPVHTEISWKADGKIMALGNEDGSIEIFQ\IPNLKLICTIQQH 

HKLVNTIS WHHE \HGS PAQKLS YL \MPSGSQQCS PFTCHNLKNC 
P*KAAPESPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH*WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIELEKKRLSQ 
1 PKAKPKKKKKPTxjRTPVKLES IDGNEEESMKENSGPVENGVSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTXNNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKBELHQDCL 
VIxATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDIEGKG 
HLENGHPELFHQLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLS IHKVYEAVELLKS 
NHFYREAIAIAKT^RLRPEDPVLKDLYLSWGTVLERDGHYAVAAK 

CYLGATCAYDAAKVLAKKGDAAS LRTAAE LAAI VGE DELS AS LA 
LRCAQELLIxANNWVGAQEALQLHESLQGQRLVFCl^EIiLSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 
QRAFQKLQNIKYPSATNNTPAKQLLLH1CHDLTLAVLSQQMASW 








DEAVQALLRAVTOSYDSGSFTIMQEVYSAFLPDGCDHLRDKLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGE31M 
LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 
TANGPDKNEPEVEAEQPIiCSSQSQCKEEKNEPLSLPEIiTKRLTB 
ANQRMAKFPESIKAWPFPDVLECCLVLLLIRSHFPGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRDVMLENYSNL
IS LDLE SS CVTKKLS PEKE I YEME S \ PSGR I WGNVST I TFQYNG 
LG DNME CKGNLEGQVS KS EGL YMC VKI TCE EKATESHS TS S TFH 
RII/HYQGKIVKCKECRQGFSYLSCLIQHEENHNI*KCSEVNKH 
RNTFSKKPSYI*HQ\KFRLGEKPYECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA
SQLNEHQRIHTGEKPYECKECGKTFFRGSQI/TYHLRVHSGERPY 
KCKECGKAFISNSNLIQHQRIHTGEKPYKCKECGKAFICGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYLTQHEKIHGEKHYECKEC 
GKTFVRATQLTYHQRIHTGEKPYKCKECDKAF/HLWLTILSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEKNPEN 


5839 


1 

1 



2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDFEBCLKD\SPRFRAAL 

EEVEGDVAELELKL\DKLVKLCIA\MIDTGKAFCVANKQFMNGI 

RD\LAQNS\NNDA\WETKFAPSFLDSLQEMINFHTIL/L*PNS 

E IN* GHS FQNFVKEDLRKFKDAKKQFENSQ * KRKKIALVKNAP V 

PSRPASLEb*KPPNILTATRKCFRHIALDYVLQINVLQSKRRSE 

ILKSMLSFMYAHLAFFHQGYDLFSELGPYMKDLGAQLDRLVGDA 

AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 

RASNAF KTWNRRW FS I QNNQ WYQ KKFKDNPTWVEDLRLCTVK 

HCEDI ERRFCFE WSPTKS CMLQADSEKLRQAWI KAVQTS I \AT 

AYRBBCDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 

CIPGNASCCDCGLADPRWAS INLGI TLCIECSGIHRSLGVHFSK 

VRSLTLDTWEPELLKLMCELGNDVINRVYEANVEKMGIKKPQPG 

QRQEKEAYIRAKYVERKFVDKIFL*SLSPP\EQQKK\FVSKSSE \ 

EKRLS ISKFGP\GDQVRASAQSSVRSNDSGIQQSSDDGRBSLPS 

TVSAN S Ij YE P EGE RQDS S M FLDS KHLNPGLQLYRAS YEKNLPKM 

AEALAHG ADVNWANS EENKATPL I QAVLGGS LVTCE FLLQNGAN 

VNQRDVQGRGPLHHATVLGHTGQVCLFLKRGANQHATDEEGKDP 

LSIAVEAANADIVTLIiRLARMNEEMRESEGLYGQPGDETYQDIF 

RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQISSPRWRS5QRAFMSALSKTQTQSAPALQ 

GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTIKGRNLPSS 

AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 

SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 

\EMKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 

RGPT S TS I DN I DGTP VRDE RS GTP TQDEMMDKPT S S S VDTMS LLi 

SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 

SPYKQPSDGMERPSSLMDSSQBKFYPDTSFQEDEDYRDFEYSGP 

PPSAMMNLQKKPAKSILKSSKIiSDTTEYQPILSSYSHRAQEFGV 

KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP | 

SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 

h FS PQNTLAAPTGHP PTSG VE KVLAST I S TTST I E F KNMLKNAS 

RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQBEHY 

RIETRVSSSCLDLPDSTEBKGAPIETLGYHSASNRRMSGEPIQT 

VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 

SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 

LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 

VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 

SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 

KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 








TLPSHSLEHLGPPHGGGGGGGSNSSSGPPLGPSHRDTISRSGII
LRSPRPDFRPREPFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF
PPRY 


5841 


1908 


762 


GLRL FLVLTVW PMMKP S WLSRTE FS KRLLCRTLWCQ S GW S SRS Y 

TTJQMT.tfMTTCTMDDCDTOTVCTD'PC JiDDPT •PHTT/eTPT PncnifJ 

IKoImJjWtI lolDJKKbKloirub 1K1 oAKFtjJj 1A1 VoUjijibDof 1W 

RHCWMTARS CS GE KGGHWAPRQVG VYIiLPGRVGCVS S RVS PS FP 
GDGLDSGLARRGSAVS ALASGL VE E PMLGP P FHPTPRFKAVSAK 
SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\VEPMCKESDHIHI IALAQGLQRVHPGWEYMGPRPRAATTNPHI 
FP*GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHC YRG FS VWKWS YFTP FFLSHD P P PM F Y 


5842 


307 


1918 


QEP TAD FKLRS TCGCGREMTCPD KPGQL INW F I CS LC VPRVRKL 
WS SRRP RTRRNLLLGTACAI YLG FLVSQVGRASLQHGQAAEKGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWYITLRSK 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLML * HLGTLREQTWLRLESDPGGWCGVRE / WRAGGPDFLQPSS 

RLLVLEGGAPGAVLRCGPS P CGLLKQ P LDMS E VFAFHLDR I LGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLUCQKCWQNGRVPKPESGCTEIHHHEWSKMALFDFLLQI 
YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHI IQRKH 
DPRHL VF I DNKG F FDRS EDNLNF KLLEG I KE FPAS AVYVLKS QH 
LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 








\j J. AKLi V 1 L. W V lirlVjy* V AW Ei r\j V V WL» * y * Kt-K i* K.vj WG IjGAoM 
RGSRMSQPPQCLRRAQSSCCHFMVKLLDDGTFMI PGEKVAHTSL 
DALVTFHQQKP IEPRRBLLTQPCRQKDPANVDYEDLFLYSNAVA 
EEAACPVSAPEEASPKPVLCHQSKERKPSAEM/RQNNHQGSHFL 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCELWT 
LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 
LTAPGTKRQKGPHQEGREVGQLH+GDPRGQELAPNGSESPILPG 
VQARAPGLGRA 


5844 


202 

* 


2471 



FDSAVLS S I NVMAVLPGPLQIiLG VLLT ISLSSIRLI QAGAY YG I 
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPLAKDGLAMGKEMPHt* 
QYGKE YPHLPQYMKEIQPAPRMGKEAVPKKGKE I PLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPMGI P * PQGPPGPHGLPGIGK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 
PGLPGP PGL PG I GKPGFPG PKGDRGMGGVPGALGPRGEKGP IGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 

GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGPPGP PGP PGPPAVMPPTPPPQGEYLPDMGLGIDGVKPPHAYG 
AKKGKNGGPAYEMPAFTAELTAPFPPVGAPVKFNKLLYNGRQNY 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 
EYKKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASLQDKMANPKEKTAMCLVNELARFNRVQPQYKLLNER 
GPAHSKMFSVQLSLGEQTWESEGSSIKKAQQAVGNKALTESTLP 

VPT *5fPPKQTTW>JMNPOPTTPTVFT«KrnT AMTTRfiV K'PATWRPT.TjP'K' 

PFPNNRANYNFQVMYNQRYHCPI PKI FYVQLTVGNNEFFGEGKT 
RQAARHNAAMKALQALQNEP I PERS PQNGESGKDMDDDKDANKS 
E I SLVFE IALKRNMPVS FEVIKESGP PHMKS FVTRVS VGEFSAE 
GEGNS KKLS KKRAATTVXjQELKKLP PLP WEKPK\HFFKKRPKT 
I VKAG PE YG QGMNP I S RLAQIQQAKKE KE P D YVLLSERGM PRRR 
E FVMQVKVGNE VATGTG PNKKI AKKNAAEAMLLQLGYKASTNLQ 
DQLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 
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| SEQ 
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








RHKVISGTTLGYLSPKDMNQPSSSFFSISPTSNSSATIARELU4 
NGTSSTAEAIGLKGSS PTPPCSPVQPSKQLEYIiARI QGFQVHYC 
DRQSGKECVTCLTLAPVQMTFHAIGSS IEASHDQV* YATAILLC 

YGPARKWKAIKMEAMCAHAALLSLIHYLLAPSARLEKSKLFALG 
N- 


5846


1126 


456 


FS KL IMKT F I IGI SG VTNSGKTTIiAKNLQKHLPNCS VI S QDDFF 

KPESEIETDKNGFLQYDVLEALNMEKMMSAISCWMBSARHSWS 

TDQESAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPYEECKR 

RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYLDGT 

KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS/CK*IRK 
IjQGVI 


5847 


2769 


505 


APEMEDLS S PDS TLLQGGHNLLS S AS FQE S VTF KD V I VD FTQE E 

WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTE 

PWIMEPS I PVGTCADWETRLENSVSAPEPDISEEELSPEVI VEK 

HKRDDSWSSNLLESWEYEGSLERQQANQQTLPKEIKVTEKTIPS 

WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 

KEKSCKCNECGKAFSYCSALIRHQRTHTGBKPYKCN*/CVEKAF 

SRSENIiINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHQRIHTGE 

KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECX3KAFSQRG 

HFMEHQKIHTGEKPFKCDECDKTFTRSTHLTQHQKIHTGEKTYK 

CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 

HQKTHTGEKPYDCAECX3KSFSYWSSLAQHLKIHTGEKPYKCNEC 

GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 

HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 

"SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 

KPYECAECX3KAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 

HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGEKPYK 

CNECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAIiN 
KHQRLHPGI 


5848



22 


2961 



AAPRRLIiRGGDGDRTPRFPLPALLRPGPPAEAAPERRKMPAVSK 
GDGMRGIiAVFISDIRNCKSKEAEIKRINKELANIRSKFKGDKAL 
DG YS KKX YVCKLL FI FLLGHD IDFGHMEAVNLLS SNR YTEKQ I G 
YIiFI S VLVNSNSEL I RL INNAI KNDLASRNPTFMGLALHC I AS V 
GSREMAEAFAGE I PKVLVAGDTMDSVKQSAALCLLRLYRTS PDL 
VPMGDWTSRWHLLNDQHLGVVTAATSLITTLAQKNPEEFKTSV 
SLAVSRLS\RIVTSASTDLQDYTY*FCPGFIjGLSVKLLRLLQCY 
PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
ISLIIHHDSEPNLIiVRACNQLGQFIiQHRETNLRYLALESMCTLA 
SSEFSHEAVKTHIETVINALKTERDVSVRQRAVDLLYAMCDRSN 
APQ IVAEMLS YLETADYS IREE I VLKVAI LAEKYAVD YTW\ YVD 
TI IiNLIR I AGDYVS EE VWYRVIQ I VINRDDVQGYAAKTVFEALQ 
APACHENLVKVGGYI LGEFGNLIAGDPRS S PLIQFHLLHSKFHL 
CSVPTRALLLSTYIKFVNLFPEVKPTXQDVLRSDSQLRNADVEL 
QQRAVEYLRLSTVASTDILATVLEEMPPFPERESS ILAKLKKKK 
GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 
LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 
RFVCKNNGVLFENQLLQIGLKSEFRQNLGRMFI FYGNKTSTQFL 
NFTPTLICSDDLQPNLNLQTKPVDPTVEGGAQVQQ\AmiECVSD 
FTEAPVLNI QFRYGGTFQNVSVQLP ITLNKFFQPTEMAS QDFFQ 
RWKQLSNPQQEVQNI FKAKHPMDTEVTKAKI IGFGSALLEE VDP 
NPANFVGAG I IHTKTTQ IGCLLRLEPNLQAQMYRLTLRTS KEAV 
SQRLCELLSAQF 


5849 


3545 


1895 


KRRE I KETVFHH VAQAGLELLSSSNP PSSASRS AGITGMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
DEFIREDMKYKDATNKHSHLHREDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWL I E FVELPQ YEKNFRDNNVKGTTLPR I AVHE PS FM I 
SQLKI SDRSHRQKLQLKALD WLFGPLTRP PHNWMKDF I LTVS I 
VI GVGGCW FA YTQNKTS KEHVAKMMKDLES LQTAEQSLMDLQER 
LEKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
CELSRRQYAEQELEQVRMALKKAEKEFELRSSWSVPDALQKWLQ 
LTHE VE VQ YYN I KRQNAEMQLA I AKDEAEK I KKKR S TVFGTLHV 
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SEQ 
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AHSS SLDEVDHKI LEAKKALSELTTCLRERLFRWQQ I EKI CG FQ 
IAHNSGLPSLTSSLYSDHSWWMPRVSIPPYPIAGGVDDLDEDT 
PPIVSQFPGTMAKPPGSIiARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
I IS /DERYQEMRCP * R I PSGGIL 


5850 


3 


1895 


KAVIjNFSASGSVISLTGSNPMHDASMWHLKKNGIIVYLDVPLLN 
LICRLKIjMKTDRIVGQNSGTSMKDIjLKFRRQYYKKWYDARVFCE 

SGASPEEVADKVLNAIKRYQDVDSETFISTRHVWPEDCEQKVSA
EFFIEAVIEGLASDGGLFVPAKEFPKLSCGEWKSLVGATYVERA
QILLERCIHPADIPAARLGEMIETAYGENFACSKIAPVRHLSGN

QFILELFHGPTGSFKDLSIjQLMPHIFAQCIPPSCNYMILVATSG \ 

DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQI igsq 

d T7 , T\T^tj A\7^!\rT7 Qr^^ T n'pr i o r P2i t vt? t T7KrncnT7'T i r*'C'T.T , xrwvr^ r rTT CCA 
KtiVi\3fiJ\v\jv nOUr UP t^K J. C viUour 1 vjr LiL VH lol ±Ltaa*\ 

NSINWGRLLPQ\A/YHASAYLDLVSQGFISFGSPVDVCIPTGNFG 
NILAAVYJUCMMGIPIRKFICASNQNHVWTDFIKTG\HYDLRGKE 
N* AQTFFTVQ * I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKl^QDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAKWADRVQDKTCPVI ISSTAHYS KFAPAIMQALKI KEI 
NETS S S QLYLLGS YNALPPLHEALLERTKQQEKME YQV(17^ADMN 
VLKSHVEQLVQNQFI 


5851


3120 


1802 



RCYLQFLALLLTSTSARAAAAIAAAEEPAGS PSVMTRAGDHNRQ 
RGCCGS LAD YLTS AKFLLYLGHS LS TWGDRMWHFAVS VFLVEL Y 

QNVSVIIiCGI I LMMVFLHKH E LLTMYHG WVLT SCYILIITIANI 
ANLAS TATAI TIQRDWI WVAGEDRS KLANMNATIRR IDQLTNI 
LAPMAVGQ I MTFGSPVIGCGFISGWNLVSMCVE YVLLWKVYQKT 
PALAVKAGLKEEETEXKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELEHEQEPTCASQMAEPFRTFRDGWVSYYNQPVF/LGWHGSCFP 

LAT S KMWFG S DRS DLRIGTAFL FDLVCDLCIHAWKP PGLVR FS F 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDS I SRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGPFITSGPG/WFRQ 
YYFFISGRH*VLFTESDFYYVAMDFGGHGLSSHYSPGVPYYLQT 
FVSE I RRWAGKKQSVYFRRCGGCSRAP PLITGGGVGSRKQRWP 
ESGAWALAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGSDLRPR 
pvcr/pni.Tr xick + aanrjpniA wQvrrr.PTi'fiT.nnX dpt.tA pp t pup 

r VlDd lvjXji.iJVV_i\."^irt.yolry v \rio v i\_ui_ r uijovs \ yv-i-itt \r jf JLr ivtr 

LLLHPRRPRLHPGTRGVAVEPHALRVVHVAHGEEAGIRAAGPGH 
GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 
SRWVRGN FRTGTAATLIG FS RNP TLNGS ENWGSLVS I QEEG PDT 
GWEREKRNPAEMGNPQRWASPIHTPPLGPEILRAMPEAIiRAMPE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
T.SSIiCITESPSONWTPCLIiLLTCPRGLF 


5854 


86 


938 


KG RNTAP EKKGAALNNRENAS S * NGY / S RWKQDI RR I ENHI IQE 
LKHLCAM I KRVLLERLENTRKLRELTEGRTLDWPQNRI TEVS AK 
RQ I VTE YRE KGKRN * EE KKRD LEGRSRRYNLC 1 1 G I PETEDRAS 
GAET I KDLLE/ENFPELKNELDLQMEKAHR I PLKFNE KKAASRH 
IRVTFL/ KFQRRNILQASSQRKQVTYKGAKVRLTSDFSPAI lna 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINGELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRS YGC KAP SRI SHLHK \ FL FLLLP S LLMG YS ES P P P I TDS WAP 
FISLTHHVLSQSQSPLSSNCWICLSTHTQ*FTALPADLLTWTQS 
NVS IiHI S YLAI PFLADS FLKPV/ L * PGNS AKHLS FKLS SLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTQTSFISPPP 
LCLSRTYPNPAHATMVGQVPQS LCGLI FTL/RTPCRP S I LHPNY 
KI I STSAWQKVLCFSGSPTIHTSLHLTTGSSFLSFHP I PGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT+QPPHRGSN/RLTVBKDN 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








FFLS P KPNSLHQLPSQ \TP YQALTGAALAGS YP I WENENTLS WL 
PTFTYNFCliSTPSLFFLCDTN+YLCLPANWSGTCTLVFQAPTIN 
ILPPNQTIL1SVEASISSSPIRNKWALHLITLLTGLGITAALGT 
GIAGITTS ITS YQTLFTTLSNTVEDMHTS ITSLQRQLDFLVGVI 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFIiLLMIGP 
CIFNLVSRFISQRLNCFIQASMQKHIDNIFHLCHV*YQSLRGNH 
SEAPEPRP 


5856 


173 


1137 


PWLHGLGLS AVFLFYL* / YVTFHLYGGI ILLLLIFI S IAGILYK 
FQDVLLYFPEQPSSSRLYVPMPTGIPHENIFIRTKDGIRLNLIL 
IRYTGDNS PYSPTI I YFHGNAGN1GHRLPNALLMLVNLKVNLLL 
VDYRG YG KS EGE AS E EGL YLDS EAVLD YVMTS P DLDKTK I YLSG 
RSLG\GAAAIHLASDNSHRISAIMVENTFLSIPHMASTLFSFFP 
MRYLPLWCYKNKFLS YRKISQCRMPSLFI SGLSDQLI PPVMMKQ 
LYELSPSRTKRLAIFPDGTHNDTWQCQGYFTALEQFIKEWKSH 
SPEEMAKTSSNVTII 


5857 


1597 


563


KLIGKVLVLS WADAMAAFAVE PQGPALGSEPMMLGS PTS PKPG 
VNAQFLPGFLMGDLPAP VTPQPRS ISGPS VGVMEMRS PLLAGGS 
PPQPWPAHKDKSGAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDS LTS EDH \ LDDS WGDC I WGFLKAS A\S Y I LL \QFAQ YGG IS* 
NMWMSNTGNWMH I RYQS KLQARKALS KDGR I FGE S IM IGVKPCI 
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSDYQVISDRQTPKKDESIiVSKAMEYMFGW 


5858 


355



1419 



PPHQPAAASTSXHQQQQP PP PPQDSS KP WAQGPGPAPGVGSAP 
PAS SSAPPATP PTSGAP PGSGPGPTPTPPPAVTSAPPGAP PPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGH PKP PHRGGGE PRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 
LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 
ALA+NCPKPELG*YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 
MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCIiN 
FAS 


5859 


307



1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAEVQGKYVKKETS PLLR 
NLMPS FIRHGPTI PRRTDI CLPDSS PNAFSTSGDGWSRNQS Fit 
RTP IQRTPHE I MRRESNRLS APSYLARSIiADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGSEDLPLPPGWS 
VDWTMRGRKYY IDHNTNTTHWSHPLEREGLPPG WERVESSE FGT 
YYVDHTNKKAQY\RHPCAPTCTSV*STTSCHI/AS/RQQTERNQ 
SLLVPANP YHTAE I PDWLQVYARAPVKYDH ILKWELFQLADLDT 
YQGMLKLLFMKELEQ I VKMYEAYRQALLTELENRKQRQQWYAQQ 
HGKNF 


5860 


2956 



1270 


TIRVEEFPLCPGGGKAQLSSASLLGAGLLLQPPTPPPLLLLLFP 
LLLFSRLCGALAGPI I VEPHVTAVWGKNVS LKCL I E VNETI TQI 
SWEKIHGKSSQTVAVHHPQYGFSVQGEYQGRVLFKNYSLNDATI 
TLHNIGFSDSGKYICKAVTFPLGNAQSSTTVTVLVEPTVSLIKG 
PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETATIISQYKLFPTRFARGRRITCWKHPALEKDIRYSFILDI 
Q YAPEVS VTG YDGNW FVGRKGVNL KCNADANP PP FKS WS RLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 
VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKIJPVNITIjIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRWMARLL 
SEGEQGIPTACAAFAQQPAG/EPRRGLAGVGEGGPQCSWVNYRC 
TLEFLVSLLGTDLARGRGNSASGPTAPADSKQL/ML*DVHRRVI 
LE * RMNSGS PARDNAPSQRFCTNLS EGLRFG IS P S WREALYGCH 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








A 


5862 


1556 


483 


PP FQLI MGE I KVS PDYNWFRGTVPLKKI I VDDDDS KIWSLYDAG 
PRS I RCPLI F LP PVSGTADVr r RQ 1 JjAJj 1 (jrWG X K.V1 AUQ Y F v IW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHSLI LCNS FSDTS I FNQTWTANS FWLM PAFMLKKI VLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RDI PVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLKTGGNFPY 
LCRSAEVNLYVQIHL/R /RNSME PNTRPLTHQWS VPRS LRCRKA 
AIiASARRSSSVSIiAVNDE1jTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPLAAPREDTMGPLMVLFCLLiFLYPGLADSAPSCPQN 

vnisggtftlshgwapgslltyscpqglypspasrlckssgqwq 
tpgatrslskavckpvrcpapvsfengiytprlgsypvggnvsf 
ecedgfi\lrgspvrqcrpngmwdgetavcdngaghcpnpgisl 
gp\vrtgfrfghgdkvryrcssnlvltgsserecqgngvwsgte 
picrqpysydfpedvapalgtsfshmlgatni^tqktkeslgrki 
qiqrsghlnlyllldcsqsvsendflifkesaslmvdrifsfei 
nvsvai itfasepkvlmsvlndnsrdmtevisslenanykdhen 
gtgtntyaju^svylmmnnqmrllgmetmaw\qeirhaiill\t 
dgk\shmggspktavdhireilninqkrndyldiyaigvgkldv 
dwrelnelgskkdgerhafilqdtkalhqvfehmldvskltdti 
cgvgnmsanasdqertpwhvtikpksqet\c\rgalisdqwvi,t 
aahcfrdgndhslwrvnvgdpksqwgke fliekavi spgfdvfa 
kknqgil\efygd\diall\klaqkvkm\sthcqgpsclp\ctm 
\eanlgflretfkgstcr\dheneii/vwnkqsv\pahf\vaij\n 
gsklehltlrmgvewtsccrglspkkktm\fpnlt\dvre\vvt 

D\QFL\ CS \GPQEDESP \ CK* E\SGGA\ VFLERRFRLSAGGWC 
SWGL\YWP\CLGSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q* S PWLRQHPGGMS * I FLPLLANGHLS P FACPARI CRPLHFLPS 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPliLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPSIPPSSPLACVLKNLKPLQLTPDLKPKCLIFFCNTAWPQY 
KIjDNDSK*PENGTFEFSILQVLDNSCHKMGKWSEVPDVQAFF\S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
S1^FHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTIVGCI F FKTAI I SHFKGGMYLCVCMCTC 
LS VCVCVQVGSWI CV/ CVSMCACVSLCTC \ ICRCI SMYTREHAC 
ACniV*VYMCMS/VCTCVSTCIDVRVCAHVar!fMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/ CVY VCVLCVWACMRMSTCVWLVYG * ACTCVWMKM / CSCTCR / C 
VHVCCM S MHACECLCVYLH I CG CAGTRRW WAGSARGS RS CS RLP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVS PRRGADCFEAPDVPKQPPGWGRAS FEERGC 
GGRGWVCAPPLNGPQCCCFS I KPELKAKKKK 


5866 


98 


3197 



ARPE V PAP P AWLSRRGAAKMGDKKDD K0S P KKNKGKERRDIiDDL 
KKEVAMTEHKMS VEE VCRKYNTDC VQGLTHSKAQE I LARDGPNA 
LTPP PTTPEWVKFCRQLFGGFS I LLWIGAI LCFLAYGI QAGTED 
DPSGDNLYLGIVLAAVVTITGCFSYYQEAKSSKIMESFKNMVPQ 
QALVIREGEKMQVNAEEVWGDLVEIKGGDRVP7UDLRI I SAHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGWVATGDRTVMGRIATLASGLEVGKTPIAXEIEHFIQIilTGV 

7\Trt?T r\rci7T?TT ct t T f vtTaIT.t? ftTTT TTT.TfZT T^7&KI\rDTrfJTiT«2XT T \7 T l*V 
AVr JjGVolf c ILoljILoI I WJjejAV irLjlvji. J» vhjm v riiuDiirti viv 

CLTLTAKRMARKNCLVKNLEAVETLGSTS T ICSDKTGTLTQNRM 

TVAHMW FDNQ IHEADTTEDQS GTS FDKS S HTWVALF * H / LLGF C 

NRPVFKGGQDNI PVLKRDVAGDASESALLKCIELS SGSVKLMRE 

RNKKVAE I PFNSTNKYQLS IHETEDPNDNRYLLVMKGAPERILD 

RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 

P E EQFP KG FAFDCDD VN FTTDNLCF VGLMSMIGPP RAAVPDAVG 

KCRSAG I KVIMVTGDHP I TAKAI AKGVGI I FEGNETVEDIAARL 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








NIPVSQVNPRDAKACVIHGTDLKDFTSEQIDEILQNHTEIVFAR 
TSPQQKLI IVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGI 
AGSDVS KQAADMI LLDDNFAS I VTGVEEGRLI FDNLKKS IAYTL 
TSNIPEITPPLLFIMANIPLPLGTITILCIDLGTDMVPAISIiAY 
EAAESDIMKRQPRNPRTDKLVNERLISMAYGQIGMIQALGGFFS 
YFVIl^NGFLPGNIiVGIRLNWDDRTVNDIjEDSYGQQWTYEQRK 
WEFTCHTAFFVS I VWQWADL 1 1 CKTRRNSVFQQGMKNKILI F 
GLFEETAIAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDE I RKL I LRRNPGGW VEKET YY 


5867 


3 


1485 



LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSS P VAKP 
GPVTCTLTRKKNKKKKRFWKSKAREVSKKPASGPGAVVRPPKAPE 
DFSQNWKALQEWLLKQKSQAPEKPIiVISQMGSKKKPKIIQQNKK 
BTSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDI VPERGDIEHKKRKAK\GQPQPHPPR/ IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALALDCEMVGVGPKGEESMAARVS I VNQYGKCVYD JCYVKP 

TEPVTDYRTAVSGIRPENLKQGEELEVVQKEVAEMLKGRILVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
I LGLQVQQAEHCS IQDAQAAMRLYVMVKKE WESMARDRRPLLTA 
PDHCSDDA*QS CPAAAAAPLQRQCDQSQGQ I TS PQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833


LTAGASHTQDAS QSTSAKYPAAAQNL/ CVTNAMREDLADIWYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTS WTEDEDFS I LLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREY5fSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGST^LGVCLHTSSSGLDLPMKVVDMFG 
CCIiPVCAVNFKCLHELVKHEENGLVFEDSEEIiAAQIjQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122



833 


LTAGASHTQDAS Q S TS AK YPAAAQNL / C VTNAMREDLAD I W Y I R 

AVTVYDKPAS FFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 

TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFS ILLAA

LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 

IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 

CCLPVGAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 

DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122



833 


LTAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMREDLADIWYIR 
AVTVYDKPAS FFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAJTERDAGSGLVTRLRERPALLVSSTSWTEDEDFS ILLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 



FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 

VLKLL*LSLRRL*LEPTI*NGLLT*CSRLSVFRFLKV\GSVYEP 

LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 

CGGDQKAJCIQDS L YCAAGAWALALAYRRI DDDKGRTHELEHSAI 

KCMRG I L YCYMRQADKVQQ FKQDPRPTTCLHS VFNVHTGDELLS 

YEEYGHLQINAVSLYLLYLVEMISSGLQIIYNTDEVSFIQNLVF 

CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 

L*KQFNGFNLraNQGCSWSVIFVDLDAHNRNRQTLCSLLPRESR 

SHNTDAALLPCISYPAFALDDEVLFSQTLDKWRKLKGKYGFKR 

FLRDGYRTSLEDPNRCYYKPAE I KLFDGI ECEFPI FFLYMMIDG 

VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 

KNNPGSQKRFPSNCGRDGKLFLWGQALYI IAKLLADELISPKDI 

DPVQRYVPLKDQRNVSMRFSNQGPLENDLWHVALIAESQRLQV 

FLNTYGIQTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 

PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 

DI KNALQF I KQ YWKMHGRPLFLVL I REDN I RGSRFNP ILDMLAA 

LKKGIIGGVKVHVDRLQTLISGAWEQLDFLRISDTEELPEFKS 

FEELEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 

QKLNDCS CLASQAI LLG ILLKREG PNF I TKEGTVSDH IERVYRR 
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SEQ 
ID 
NO:


Predicted
beginning
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








ACjoy iUjWo V VKKAAoJjJjo IxVVL/oliAfc'S JL 1 N vli Vytslvy V 1 LiW»Ar t» 
HEEEVISNPLSPRVIQNIIYYKC^NTHDEREAVIQQELVIHIGWI 
ISNNPELFSGTLKIRIGWIIHAMEYELQIRGGDKPALDLYQLSP 
SEVKQLLLDILQPQQNGRCWLNRRQIDGSLNRTPTGFYDRVWQI 
LERTPNGIIVAGKHLPQQPTLSDMTMYEMNPSIibVEDTLGNIDQ 
JrUX Kyi V V JaJuJuM V Vo ± V JjilKIN rCtijCtrUUiVViJljDKijViUjJ\J?iNarU 
KDQSRLKE I EKQDDMTS FYOTPPLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872 


68 


665 


VQGYMYRFVIKINSCYSEKTSICRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI \SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFTRIIPG\LC^C3GDFTHHNGTGGKSLYSKEFDDENFI/IjKH 
TAPGVLSTANAGPTTNGSQFF I CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGKTSKKITAANCGQL 


5873 



2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GE CVG PNKCRC F PG YTGKTCS QD VNE CGMKPRP CQHRCYNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCliCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
ivy U Y KAjJN (jJjKL. £> AI PrUN o V IUsVIjKAPGTX JvuKI KKbLiAnKNbMK 
KKAKI KNVTPE PTRTPTP KVNIO P FN YEEIVS RGGNSHGG \ KKG 
NEEKMKEGLEDEKREEKAbKD*HRRERPFRG\DVFFPKVNEAGE 
FGIiI L\ VQRKALTS KLEHKADLNI S VDCS FNHG \ I CDW\ KQDR\ 
EDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDLQPQSNFCLLFDYRIAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKWKTGKIQLYQGTDATKS 1 1 FEAERGKGKTGE I AVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRLARRRRRVRSLRRRRGWLRARWSRGQNNMAARRITQETFD 
AVLQEKAKRYHMDASGEAVSETLQFKAQDLLRAVPRSRAEMYDD 
VHSDGRYSLSGS VAHSRDAGRESLRSDVFSGPS FRSSNPS I SDD 
SYFRKECGRDLEFSHSNSRDQVTGHRKLGHFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
SRDYDVDHSG\EA\DSVLRGS\SQVQA\RGRALNIVDQEGSLLG 
KGETQGLLTAKGGVGKLVTLRNVSTKKIPTVNRITPKTQGTNQI
QKNTPS PDVTLGTNPGTED IQFPIQKI PLGLDLKNLRLPRRKMS 
FDI IDKSDVFSRFGI E 1 1 KWAGFHT I KDDI KFSQLFQTLFELET 

etcakmlasfkcslkpehrdfcfftikflkhsalktprvdnefl 
nmlldkgavktkncffei ikpfdkyimrlqdrllksvtpllmac 
nayelsvkmktlsnpldlalalsttnslcrkslallgqtfslas 
s frqekil* avglqd iaps paafpnfedstlfgrey idhlkawl 
vssgcplqvkkaepepmreeekmipptkpeiqakapsslsdavp 
qradhrwgtidqlvkrviegslspkertllkedpaywflsden 
sleykyyklkiiaemqrmsenlrgadqkptsadcavramiiysrav 
rnlkkkllp\wqrrgllraqg\lrg\wkarra\ttgtqtllflr 
apglkhhgrqapgls \qakpslpdrnd\aakd\cpldpv\gpsp 
qdpsleasgpspkpagvdiseapqtsspcpsadidmkdngrtae 

rr-r n T3T7riTT\r\\Tr*\ OT?TW\T?\ OT\ T?MC TTlM'DFiT HPT \ tIT»/^T\TCe\ TVPtT 

K 1 tAK* VA y Vtt\ PH. 1 K^jH \bl lj\llijyj>li>i> \At Js. 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEBEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPSLEGSTPADGLPGEA\AEDDL/ALGAPAI»FTGLLQVTCFPFG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLDFAQQKLXTDKXNIiGFQXMLQKMGWKEGHGLGSLGK 
GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
TFVF 


5875 


296 


1848 


LAALGGLPLMRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LEFSGSLFPHAICLGDVDNDTLNELWGDTSGKVSVYKNDDSRP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKQHIPANTKVMLISDIDGDGCREL 
WGYTDRWRAFRWEELGEGPEHLTGQLVSLKKWMLEGQVDSLS 
VTLGPLGLPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDGS 
/SGDPSCPRRGAAPD1WPYPQQECLHSPNWQHQT\SHGTESSGS 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








GLFALCTLDGTLKLMEEMEEADKLLWSVQVDHQLFALEKLDVTG 
NGHEE WACAWDGQT Y I IDHNRTVVRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL*LVPCPTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 


5876 


1122 


224 


HLPLGVPSKVAGAAAMEPQEERETQVAAWI,KKIFGDHPIPQYEV 
NPRTTEHjHHLSERNRVRDRDVYIiVIEDLKQKASEYESEAKYLQ 
DLLME S VNFS PANLS S TGS R YLNAL VDS AVALETKDTSLAS F I P 
AVNDLTSDLFRTKSKSEEIKIELEKLEKNLTATLVLEKCLQEDV 
KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL 
SARGQ\DAFS VP IQSLVALIRENWPRLKQQTI PLK\ KKLiES YLD 
LMP\NPSHCSK+RIEEAK\RELA\SIEAELTRRVS\MMEL 


5877 


2030 


1907 


GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEV 
LSRELIEMLAISRNQKLIiQAGEENQVLELLlHRDGEFQELMKLA 
LNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQILATAVYQ 
AKE KL KS I E KARKGA I S S EE 1 1 KYAHR I S ASNAVCAP LT WVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR* INI ILILQKSVCEL 


5878 



950 


2113 


GLWKCMQLQGPHTHRVQP * PTPRQQGPQ \ VPVAVIAGNRPNYLY 
RMLRSLLSAQGVSPQMITVFIDGYYEEPMDVVALFGIiRGIQHTP 
ISIKNARVSQHYKASLTATFNLFPEAKFAWLEEDLDIAVDFFS 
FLSQS I HLLEEDDSLYC I SAWNDQGYEHTAEDPALLYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLNmGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
ME KDDD FTTWTQLAKCLH I WDLDVRGNHRGLWRL PRKKNHFLW 
GVPAS P YS VKKPPS VTP I FLEPPPKEEGAPOAPEQT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGSPPTIiLPLSPTSPRCAATMASSDED 
GTNGGAS EAGEDREAPGKRRRLGFIiATAWLTFYD IAMTAGWL VL 
AIAMVRF YME KGTHRGIiYKS IQKTLKFFQTFALLE IVHCLIGIV 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESVVLFLVAWTVT 
EI TRYS F YTFSLLDHLP YF I KWARYNFFI ILYPVGVAGELLT I Y 
AALPHVKKTGMFS IRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG\G*L*KRMIK*SLQTRCFFQNNQDYLSPSF 
NNKNKQLCEISWIVWFLKI 


5880 


1138 


1324 


SLWCLVAGGLGLGPS SQNPXjQRAG I IiARPREARGTFSALTACS A 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
*KKKRGRCSS/WLSQPQHEREKEVVLLRRSMAEGERARAASDVL
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSEHTDGHTS VQS VI EKLQEENRLLKQKVTHVEDLNAKWQRYN 
ASRDE YVRGLHAQLRGLQI PHEPELMRKEISRLNRQIiEEKINDC 
AEVKQELAASRTARDAALERVQMLEQQ I LAYKDDFMS ERADRER 
AQSRIQELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEELLRHVAECCQ 


5881 


26 


441 


GGI HPSPTEAPRAQHLTMDCTWRI LFLVAAATGTHAQVQLLQSG 
SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGLE*MGPFD 
LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGCVEMLYSHSLEYNPEWISVQSAVAPAQLAliNSDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYS KQVELELQQ I EQKS IRDYI QESENIASLHNQ ITACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELBAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIRBFILQKIYSFRKPMTNYQ 
IPQTALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLSYYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 
TLGTRGSV I SPTELEAP I LVPHTAQRGEQRYPFEALFRSQHYAL 
LDNS CRE YL F I CEF FWSGPAAHDLFHAVMGRTLS MT LKHLDS Y 
LADCYDAIAVFLCIH I VLRFRNIAAKRDVPALDRYWEQVLALLW 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








PRFELIIJEMNVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALV 
SINQTIPNERTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVPLI 
NNYDMMLGVLM\E*ERAADDSKEVESFQQLLNARTQEFIEELLS 
PPFGGLVAFVKEAEALI ERGQAERLRGEEARVTQL IRGFGSSWK 
SSVESLSQDVMRSFTNFRNGTSIIQGALTQLIQ\LYHRFHRV\L 
SQPQLRALPARAELINIHHLMVELKKHKPNF 


5883 


2 


1374


E FPGRR FRAVME AGAGAGAGAAGW S CPGPGPTVTTLG S YEASEG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEBEEEEEEEEEVE 
EEEEQVQKGGSVGSLSWKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLELQGLLEDERLASAQQAEVFTKQIQQLQG 
ELRSLREEISLLEHEKESELKEIEQELHLAQAEIQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
M KSS E PS GS LGLS D YSGLQEELQELRER YHFLNEE YRALQESNS 
SLTGQLADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 
EQRRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5884 



4251 


2522 



GVLARAS ARLRVPLTGVRACAE P E VGAE PAKVAGAAE PDEDGGR 
S RLRDCGD YT PS ERLG P KGAMLW FQGAI PAAI ATAKRSGAVF W 
FVAGDDEQ S TQMAAS WEDDKVTEASSNS FVAI KIDTKSEACLQF 
SQ I YPWCVPSSFFIGDSGI PLEVI AGS VSADELVTRIHKVRQM 
HLLKSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE 
IPSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDLNIRVE 
RLTKKLEERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE 
ELTKRMLEERNREKAEDRAARERIKQQIALDRAERAARFAKTKE 
EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQ F P SDAP LEE ARQFAAQTVGNTYGNFS LATM FPRREFTKEDY 
KKKLLDLELAPSASWLLP/ALFINF*AGRPTASIVHSSSGDIW 
TLLGTVL YP FLA I WRL ISNFLFSNPP PTQTS VRVTSS E PPNPAS 
SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5885


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFLDV 
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN
YLQ IDEEEYGGTWELTKEGFMTS FA/ IVHGHLDHLLHCHPL * LM 
VYSSQVLPIQSKGPS 


5886 


86 


1341 


PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSQPPSPGS 

GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 

EVLLEALFLTVD P YMRVAAKRLKEGDTMMGQQVAKWES KNVAL 

PKGTI VLASPGWTTHS I SDGKDLEKLLTEWPDTI PLSLALGTVG 1 

MPGLTAYFGLLE I CGVKGGE TVtWNAAAGAVGS VVGQIAKLKGC 

KVVGAVGSDEKVAYLQKLGFDWFNYKTVESLEETLKKASPDGY 

DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 

PPEIGIYQELRMEAFWYRWQGDARQKALKDLLKWVLELPYFVI 

D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 

NMPAAFMGMLKGDNLGKTIVKA 


5887 


1937 


104 


APGCRG CRATRCPCRGPR WDSLGDEAARS PAAPGGAPGLLGLRE 

RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 

PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 

ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 

FCIHITN\*NLHYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 

VSFTTKLDIPTAAKYEYGVPLQTSDSFLRFPSSLTSSLCTDNNP 

AAFLVNQAVKCTRKTNLEQCEE IEALSMAFYSSPE ILRVPDSRK 

KvPITVQSIVIQSLNKrLlKREDIDVLyPlLW 

LE VKYS LTYTDAGE VTKADLS FVLGTVS S WVPLQQKFE IHFLQ 

ENTQPVPLSGNPGYWGLPLAAGFQPHKGSGIIQTTNRYGQLTI 

LHSTTEQDCLALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVAQ 

KVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHFITQSFNRK 

DS CQLPGALVI E VKWTKYGS LLNPQAKI VNVTANL I S S S FPEAN 

SGNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 

V 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5888 


375 


2302 



LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG 
LELHPDYKTWGPEQVCSFLRRGGFEEPVLLKNIRENBITGALLP 
CLDESRFENLGVSSLGERKKLLSYIQRLVQIHVDTMKVINDPIH 
GHI ELHPLLVRI IDTPQFQRLRYI KQLGGGYYVFPGASHNRFEH 
SLGVGYLAGCriVHALGEKQPELQISERDVLCVQIAGLCimLGHG 
PFSHMFDGRFIPLARPEVKWTHEQGSVMMFEHLINSNGIKPVME 
QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
I VSNKRNGI DVDKWDYFARDCHHLG I QNNFDYKRFIKFARVCEV 
DNELRICARDKEVGNLYDMFHTRNSLHRRAYQHKVGNIIDTMIT 

DAFLKADD YIEITGAGGKKYRI S TAIDDMEAYTKLTDNIFLEIL
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES
LPKEVASAKPKVLLDVKIIKAEDF IVDVINMDYGMQEKNPIDHVS
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\
NDSTFSPKI PTRLPRRLPKSRV\QLFKDDPM


5889


1831 


731 



LPAACGRPVTARPRQAPEGRSGRPRDLDPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVI IAGNNDSKAKQWSKIKEET 
LjnJJ1\±i i * VJjIiCCPGWIjCiiWNSSDPPTSASRGAGTTGVHHHFLLK 
FGIFIL\DLASMTSIRQFVQKFKMKKIPLHVLINNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTLKESGS PGHSARWTVS 
SATHYVAELNMDDLQS S ACYS PHAAYAQSKIiALVLFTYHLQRLL 
AAEGSHVTANWDPGVWTDLYKHVFWATRL^^ 
DEGAWTS I YAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWSKSCEMTGVLDVTL 


5890 


1322 



200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEVVKTRLQSSSVTLYISEVQLNTMAGASVNRVVSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVTYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEVVRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5892 


1764 


379 


WLRVCGRLSVNSAVSSRTGGWSAGLTCAMQRLQWLGHLRGPA 
DSGWMPQAAPCLSGAPHASAADWWHGRRTAI CRAGRGGFKDT 
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSD I PETVPLSTVNRQCSSGLQAVAS I AGGIRNGS YDIGMA 
Laj V Jbb M fa JjAUKUN PCjN X 1 £> KLM E KS KARDCL I PMG 1 TS ENVAER 
FGISREKQOTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRSITVTQDEGIRPSTTMEGIAKLKPAFKKDGSTTAGNSSQVS 
DGAAAILLARRSKAEELGLPILGVLRSYAWGVPPDIMGIGPAY 
AI P VALQKAGLTVSDVD I FEINE \AFASQAAYCVEKLRLP P * EG 
* TPLGGASGP *GHPLGLHWGHVQVI TLAQ * S * SARGKRAYRSGC 
PCAIGSWNGSPLPVFEYPWGT 


5893 


3 


1653 


I LS KRRCQKAKTKELMAKKVAV I GAGVSGL I S LKCCVDEGLE PT 

MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMIEQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL 
CGAIKVKSTVKELTETS AI FEDGTVEENIDVI I FATGYSFS FPF 
LEDS LVKVENNMVS L YKY I F PAHLD KSTLAC I GL I Q PLGS I FPT 
AELQARWVTRVFKGLCSLPSERTMMMDIIKRNEKRIDLFGESQS 
QTLQTNYVDYLDELALEIGAKPDFCSLLFKDPKLAVRLYFGPCN 
S Y* YRLVG PGQWEGARNAI FTQKQR I LKPLKTRALKDSSNFS VS 
FLLKILGLLAWVAFF\ CQLQWS 



5894 



174 



1673 



R YS P KKVLQNKES SLKLGMATALVS AHS LAP LNLKKEGLR WRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFL I ILPKELQARVQEH 
HPESREDVWVLEDLQLDLGETGQQVDPDQ PKKQKI LVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVES 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
IHTGEKPYLCIHCGKNFRRSSHLNRHQRIHSQEEPCECKECGKT
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDLIRHHRIHTG
EKPFKCNICQKAFRLNSHLAQHYRIHNEEKPYQCSECGEAFRQR
SGLFQHQRYHHKDKLA 



5895 



2967 



86 



HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGIAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK



5896 



2967 



86 



HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5898 



2967 


86 



HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQXNCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDDKKQLAVAEG
KPPEAPKGKKKK


5899 


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEEIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILMSTMRNQARLKVLRARNDLI SDLLSEAKLRLSRIVEDP 
EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 
PEYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGNQRIKVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64 


1409 


KAASRDS P C LE FCP LCG VS SHDLQHRMWYHRIjSHLHS RLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANT VMRFD YVWLRDH CR S AS C YNS KTHQRS LDTAS VDLC I KP 
KTIRIiDETTLFFTWPDGHVTKYDLNWLVKNSYEGQKQKVIQPRI 
LWNAEIYQQAQVPSVDCQSFLETNEGLKKFLQNFLLYGIAFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL 
QKAPEEFELLSKSAI\KHEYIEDVGECHQPHDWDWAQS*ISTHG
/YKELYLIRYNNYDRAVINTVPYDWHRWYTAHRTLTIELRRPE 
NEFWVKLKPGRVLFIDNWRVLHGRECFTGYRQLCGCYLTRDDVI, 
NTARLLGLQA 


5901 


1 


2121 


VAIEQTSLKMMQAVGGAPARPTGEYICNQCGAKYTSLDSFQTHIi 
KTHLDTVLP KLTCPQ CNKE FPNQESLLKHVTI H FMITSTYYICE 
SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQEVFDSKVSIQLHL 
\ AVKHSNEKKVYRCTS CNWDFRNETDLQLHVKHNHLENftGKVHK 
C I FCGES FGTEVELQCHITTHS KKYNCKFCS KAFHAI ILLEKHL 
REKHCVFETKTPNCGTNGASEQVQKEEVELQTLLTNSQESHNSH 
DGSEEDVDTSEPMYGCDICGAAYTMETIiLQNHQLRDHNIRPGES 
AI VKKKAELI KGNYKCNVCSRTFFSENGLREHMQTHLGPVKHYM 
C P I OGERF P S LLTLTEHKVTHS KSLDTGNCR I CKMP LQS EEE FL 
EHCQMHPDLRNSLTGFRCWCMQTVT^ I r HMUiN. 
GS AVQTTGRGQHVQKLY KCAS CL KEFR S KQDLVKLD INGLP YGL 
CAGCVNLSKSASPGINVPPGTNRPGLGQNENLSAIEGKGBCVGGL 
KTRCS*IATFKF*VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 

EGLNHECKLCSQTFDS PAKLQCHL I EHS FEGMGGTFKCP VCFTV 
FVQi^nCLQQHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 

HSS 


5902 


712 


209 


LKNRRRSRP S I RQS I GSTS VS RWLTS L FTY LDHTAD VQ * V* RE F 
I PLKPRQ * ED * MFQS WLHAWGDTLEEAFEQCAMAMFG YMTDTGT 
VE P LQTVEVE TQGDDLQSLLFH FLDE WL YKF S ADE F F I P \GWGE 
EFSLSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI 


5903 


2106 



735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLIiPGDPQPLQGRGLPTT 
PALFALSAVPGGAAS PMP PSG LRLL P LLL P LLWLL VLTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVLiAIjiNS TRDR VAGE S AE P E r E PEAD 1 1 AlUi V 1 KVJjFi vni 
HNE I YDKFKQSTHS I YMFFNTSELREAVPE PVLLSRAELRLLRL 
KLKVEQHVEL YQKYSNNS WRYLSNRLLAPSDS PEWLSFDVTGW 
RQWLSRGGE I EG FRLS AHC S CDS RDNTLQ VD I NGFTTGR \ RGDL 
ATIHGMNRP FLLLMATPLERAQHLQS \ SRHRQAL\DTNY\ CFSF 
HGGRNCLRC / VHC *HL I FRKDL\GW \ KW I \ HE \ P KGYHANF C\L 
GPCPYIWSLDTQYSKVLALYNQ\HKPG\ASAAP\CCVPQALEP\ 
LPIVYY\VGRKPKVEQIiSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAINTFKEEQRLIYEELIKEEKTTNNELSAISRKIDTW 

AI/jNSETEKAFKAlboKVPVDKV 1 Po ILiPEEVIjUc c»J\.r liy y J. ov? 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKES IQI WKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKS I EMSMKCASQL 
KEEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADEISRFQERDLHKLELKILDRQAKEDEKSQ 
KQRRLAKLKEKVENNVSRDP SRLY / NTHQRLGRTNQKDRTNRIjW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNEKEIVRLRTIGELLAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WDVYTGKLLLNLVDHTGWRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNWVY\ 
SCAFSPDSSMLCSVGASKAWAAILV*LRLCWHHSHTGATMVLS 
WAERVASLATGLGATFT IG * SNLAFVLQGVLYVHRCWSMSTFCF 
SFFLFFFFKVISPTVKYH*LL»SKLI FQFYGIGSLTSETNLM* S I 

KKKPKRIALLQEERLS *DKPPSSHLI * QTEVNIRILFRAILHS * 

LLIFRI * NCI * TYS * I IDP FYIQMTYDRG* FGKNKMVKF * F I EM 
*LYYFHKIAFSFCNW*HPCCLPKKFHLAVNILFACSICFSS*A 
QVGDPSLL*TSDYLKGRCQWSNNLIiTLRFLSVYFFKNLWSGKK 
REGGL* YLTLFI S VYFS * LVFGINGFQ YS F WKLHCLYFMFRLI 
FKLTFNRNI *NRI CMS AL I NL KTD FNLTMTLS I F F KLLI IYNA* 
YNLN*I+QF*YKMCHFVLCMSE*SYNICLFIAGF\LWNMDKYTM 
K 




IRKLEGHHHDWACDFS PDGALLATAS YDTRVYI WDPHNGD I LM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWR I DEDYPVQVAPLSNGLCCAFS TDGSVLAAGTHDGS VYFWAT 
PRQVPSIjQHLCRMS IRRVMPTQEVQELPI PSKLLEFLS YRI 


5906 


146 


2038 


REGAGSGRMASGA\YNPYIEI I EQPRQRGMRFRYKCEGRSAGS I 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVGKDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE 
A\IITR\IKAGINPFDVP*KQLNDIEDCDLDWRLWFRVFLPDG 
HGNL\ TTALPPV\ VSS P I YDNRAPNTAELR VCR VNKNCGS VRGG 
DE I FLLCDKVQKDD I EVRFVLNDWEAKGI FSQADVHRQVAI VFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 

KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
OS AG I TVNFPERPRPCLIjG Q TGEGR YFKTfR'DMT .PQUnawu D 

TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLS S FSTRTUPSNS QG I P P FLRI PVGNDLNASNAC I YNN 
ADDIVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPRLLSMNLENPSCNSVLDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 



1873 


TYLLSSWSS * *NLDTKIKSQVKV/RKGHKKIS WPYPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK
AYYKEFRKWEYSDVILEVLDARDPLGCRCFQMEEAVLRAQGNK 
KLVLVLNKIDLVPKEVVEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCSVPVDQASESLLKSKACFGABNLMRVLGNYCRLGEVRTHIR 

VGWGLPNVGKSSLINSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLLDAPGIVPGPNSEVGTILRNrVHVOKT.ZVnPVTDinrTTT rtPP 

NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWVSGKI SFYI PPPATHTLPTHLSAE I VKEMTEVFD IEDT 
EQANEDTMECLATGESDE LLGDTDPLEMEI KLLHS PMTKI ADAI 
ENKTxVYKIGDLTGYCTNPNRHQKGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPLQl^QALASALKNKXKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247


975


HCGIKKRGEGSGSPSPASGGFQLGCQIPEPSLPSEEETHPHTRA 
HTRTLRATIjTRRPPRSHSTRLRFPMPLDGDGGLASWK/PMRER* 
GWRRPAKAAGASLGVAATGKRGCRMSKRYLQKATKGKLLIIIFI 
VTL WGKWSS ANHH KAHHVKTGTCE WALHRCCNKNKI E ERSQT 

VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWSCS SGNKVKTTRVTH 


5909 


1 


5002


PAIPGSTIIWAPGSHSAARADGRHGSLPSQSQAPGALCGARAPP

S SNLRADRS M I CAQARAGKNL YHNRFLGLAAMAFPS RNS QS LRR 

CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 

STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 

ENFLDYKNRGVNGSHRGQ 1 1 WKI DASS YFVEPETKI CFKYYHGV 

SGALRATTPSVTVKNSAAPI FKS IGADETVQGQGSRRLI SFSLS 

DFQAMGLKKGMFFNPDP YLKI S IQPGKHS I FPALPHHGQERRSK 

IIGNTVNPIWQAEQFSFVSLPTDVLEIEVXDKFAKSRPIIKRFL 

GKLSMPVQRLLERHAIGDRVVSYTLGRRLPTDHVSGQLQFRFEI 

TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 

SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 

VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 

DGEAPASTKEEPLEEEATTQSRAGREEEEKEQEEEGDVSTLEQG 

EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 

IHTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 

AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 

HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 

SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 

EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 

SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 

IDEPLPPNWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 

RSGSIQQMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 
GGGGSDSEAESSQSSLDLRREGSLSPVNSQKITIiLLQSPAVKFI 
TNPEFFTVLHANYSAYRVFTSSTCLKHMILKVRRDARNFERYQH 
NRDLVNFINMFADTRLELPRGWE I KTDQQGKS FFVDHNSRATTF 
IDPRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASL 
LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 
RQ P S LARNHTLRE KI H Y I RTEGNHGLEKLSCDADLVI LLSLFEE 
EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLBYSGPSREFFFLLSQELFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 
MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 
EYIERMVKWRVERGWQQTEALVRGFYEVVDSRLVSVFDARELE 
L V I AGTAE IDLNDWRNNTB YRGG YHDGHLVIRWFWAAVER FNNE 
QRLRLLQFVTGTS SVPYEGFAAPPWEPMGLRRFLP * KKWGKITS 
LPPRG \HTCLQPDWDLPTVS PRTPMLYEK\LLTA\ VEETSTFGT 


5910 


1526 


446 


VAE FAAME PGRTQ I KLDPRYTADLLEVLKTNYG I PSACF SQPPT 
AAQLLRALGPVELALTSILTLLALGS IAIFLEDAVYLYKNTLCP 
I KRRTIiLWKSSAPTWSVLCCFGLWI PRSLVLVEMTITSFYAVC 
FYLLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ\R*CWALSNTPS * R* R*PWWACFSS PTASMTQQTFL 
RGAQLYGS TLSSA/ CSTLLALWTIiGI I SRQARLHLGEQNMGAKF 
ALFQVLL r LTALQPS I FSVLANGGQ I ACS PPYS S KTRSQVMNCH 
LLILETFLMTVLTRMYYRRKDHKVGYETFS SPDLDLNIiKALRWM 
AWTMKGCCTH 


5911 


109 


595 


QLPLAPCIQGKGLEMRSPKPQSFIIRSSHSGAGLIjVKNPSTPVF 
CGHRRGG AAFKY KP TP WG PEQRPTGQKHMRGGVS LLS PRLE CS 
GTISAHCNLRLPSSSNSPAPAS*LAGITGVCHHAQLIFVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


M ILNKALMLGALALTTVMS PCGGEDI VADHVAS YGVNLYQS YGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNLNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDN I F P PWN I TW LSNGHS VTEGVS ETRPS SPKSDHFLLQDQ 
VTS PSFPFE * * DL * TAKVEQLGAWFEPLLKHWGAB I PTTL 


5913 

46


1198


QLRMAGAEGAAGRQ S ELE PWS LVDVLEE DEE LENEACAVLGGS 
DSEKCSYSQGSVKRQALYACSTCTPEGEEPAGICLACSYECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEMVCQACMKRCS FLWAYAAQLAVTKI ST\GMMDWCGTLM 
E* /DDQEVIKPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAG S S S E SDLQTVF KNE SLNAE S KSG CKLQELKAKQLI KKDTAT 
YWPLNWRS KLCTCQDCMKMYGDLDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMNRVQQVELIC/GIQ* FED 


5914 


960 


124 


NLGGS ELP PEEALFIQVASMNQRRVDFYLAS I EDMLVAI /GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDVWEERRPMTTARGWHSMCSLGDS 
I YS IGGSDDN I ESMERFDVLGVEAYS PQCNQWTRVAPLLHANSE 
SGVAVWEGRIYILGGYSWENTAFSKTVQVYDREADKWSRGVDLP 
KA I AGGS ACF I AP * SLGQRTRKRKAKARGTRTGASDP S CAS WDH 
PHRHLPGLCRPAATS 


5915 


1604 


703


F PGR PTRPLKLGRRRKRAR I IQAPHCHSPRPRTCPPGALQAP EA 
PASRAEGPVAWVNGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 

PQ EDF PALGG P CP PRM P P S PGr o AV V JjIjIsaj 1 If If v r tr sfXsu V V tf ± o 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGE FPEGL * +AAGPAAH 


5916


256 


633 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
TVTGAVHRHLNHVAGI I PWVLHSQLKPTAATAQDQWTSQQYPDH 
PTRLILQ*NQATADKNN*TTALLQPHQRL\VSPRMAEA 


5917 


1343 


827 


AHQILTYLEP 1 ICLWNYNKILTVFLTKSVLEI *KFIHTPQTYR 
F *NDFFGI KEVYVSRRLRKTS F/ RLAVTFLEQAWSKECVPVDQ 
FMEHLLPSLLSLASDPVPNVRVLLAKALRQMLLEKAYFRNAGNP 
HLEVIEETILALQSDRDQDVSFFAALEPKRRNIIDTAVLBKQN 


5918 



13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
APPTAAAGSMMKKDALTLSLSEQVAAALKPAPAPASYPPA\ADG 
APS AAP P DGLiLAS PDLGLL KLAS PELERL 1 1 QSNGL VTTTP TSS 
QFLYPKVAASEEQEFAEGFVKALEDLHKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGEIAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\VAPAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PAS RKLGAQSRALERES EDPS * S PEHGS LASTASLLREQ VAQLK 
QKVLSHVNSGCQtiLPQHQVPAY 


5919 


1 



4254 


TSVQGDSQGTPTSSQGSINMEHWISQAIHGSTTSTTSSSSTQSG 
GSGAAHRLADVMAQTHI ENHSAPPDVTTYTSEHS I QVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
PPSLEAALQRWGT1SPKAPCLTTMDTNGKPLYILTYGKLWTRSM 
KVAYS I LHKLGTKQE PMVRPGDRVALVFPNNDPAAFMAAFYGCL 
LAE WPVP I E VPLTRKDAGSQQ I GFLLGS CGVTVALTSDACHKG 
LPKS PTGE I PQFKGWPKLLWFVTES KHLS KPPRDWF\ PH I KDAN 
NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQAIjTQACGYTEAE 
TIVNVLDFKKDVGLWHGILTSVmMMHVISIPYSLMKVNPLSWI 
QKVCQYKAKVACVKS RDMHWALVAHRDQRD INLSS LRML I VADG 
ANPWS ISS CDAFLNVFQS KGLRQEVI CPCASS PEALTVAIRRPT 
DDSNQPPGRGVLSMHGLTYGVI RVDS EEKLS VLTVQDVGLVMPG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FEVFAMTS SGAPI SE YP F IRTGLLG FVGPGGLVFWGKMDGLMV 
VSGRRHNADD I VATALAVEPMKFVYRGRIAVFS VTVLKDERIVI 
VAEQRPDSTEEDSFQWMSRVLQAIDSIHQVGVYCLALVPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PEIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE 
VLQWRAQTTPDHI LYTLLNCRGAIANSLTCVQLHKRAEKIAVML 
MERGHLQDGDHVALVYPPGI DL IAAFYGCLYAGCVP ITVRP PHP 
QNIATTLPTVKMIVEVSRSACLMTTQLICKLLRSREAAAAVDVR 
TWPLILDTOD*PKKRPAQICKPCNPDTLAYLDFSVSTTGMIiAGV 
KMSHAATSAFCRS I KLQCELYPSREVAI CLDP YCGLGFVLWCLC 
SVYSGHQS ILI PPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGSQTESLKARGLDLSRVRTCVVyAEERPRIALTQSFSKL 
r J^lAjJjHt'KAvSTSF GCKVNLiAJCCLQGTSGPDPTTVYVDMRAIiR 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPIiGDSH 
LGEIWVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 
TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 

IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 

TKTlA/T»T?FHYTiTVr!\nA7\nfnT^l\7TDT'MC"D/* , I?W , nDMlIT Drv^CT ann 
xiv v v ucjCiu iiij, vuvvvv v i/xvjr v xjr liNoKvjlliJN.yKl\LrlijKIAjr loAJjy 

LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAGVSRI PP*LFPPLHPTFLSLWCLHHKLP /HPPGASM 
VRP PWPRRPPAH I SSVRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQE PAVHI PGQEPLTASM 
LAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLESPESLHAKIDEAVAVLQAHQAMEQPKAYMH 


5921 


727 


157 


VCPGTGGE*GLWGQLGGLPKETPLKPMDAFTGSGLKRKFDDVDV 
GS S VSNSDDE IS S SDS ADS CDS LNPPTTAS FTPTS I LKRQKQLR 
RKNVR FDQVTVYYFARRQG FTS VPSQGGSS LGMAQRHNS VRS YT 
LCEFAOEOEVNHREILREHLKEEKLHAKKMKLTKNGTVR«;VFAD 
GLTLDDVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAG I KCQVD 
RMS FPCGCSRDGCGNMAGR I E FNP I RVRTHYLHT I MKLELES KR 

Q\GAAQQPQ\*GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHLI ILRVIENRGAEGKRK 


5922 


2475 


495 


S YSNWGLFPS VFIQVPRSRTGNLKPIFLFYS YYE \ CMETLKG \T 
CLYNATQ YKVCS PRNDRPDACYNPSEPAATTVFE IRTGLLLGDT j 
SK 1 1 TRTEEKE I PKQI TLRFDACAAINS KKLE IGCGSLN * ERS * 
RVENKYVCHESGVCKNCAYWPCVI * AT * KKNKNDSVYLQKGEAN 
PS CAAGH CNPLELI I TNPLDPHWKKGERVTLG INRTGLKPQ WI 
L I KGEVHKCS PKPVFQTF YEELNLP APELLKKTKNLFLQLAENV 
I FLLNGTS CYVRGGTT IGDRWPWEA*ELVPTDPAPDI I P I * KAB 
ASNF*VLKTSIIRQYCIAREGKDFIIPVGKPNCIGQKLYNSTTK j 
TIT* * DLNHTEKNPFSKFSKIiKTA*AHAESH* DW 1 V^bLylix * it, [ 
RHRAYFRL PNKWADSCV I GTI KPS F FLLP I KMGELLG FS VYASR 
EKKGIVIGNWKDNEWPRERIIQYYGPATWAQDGSWGYR/TP/VY j 
MLNWI IRLQAILEI ISNETGRALTVLAWQETQMRNAI YQNRIAL 
DYLLVAEGGVCRKFNJjTNCCIjQ INDy uv" V AJN 1 v KUiTi 1 JMjrtnv f 
I Q VWHKFD PE S LFGKWF PAI GGFKTL I VGVLL V I RTCLLL P CVL 
PLLFQMIKGIVATLVHQKTSAHVNYMNHYRSISQRDSKSEDESE
NSH


5923 


137


638 



QJjCGRRGQRFRTS I KRMHPI * RTCPNTNL/ 1 1 UUSQENTQ I RDL 1 
QQENRELWISLEEHQDALELIMSKYRKQMLQLMVAKKAVDAEPV 
LKAHQSHSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKIiA 
QLELENKELRELLS I SS ESLQARKENSMDTASQAI K | 


5924 


274 



2146 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLNSLTPPTSVRRM 
PLITTVTIiLKMVARHHMKLLCSKAFSTQLQQKIFLHSQMGIHHQ 
SVCMKLKPNTSHI ISILMGQPMALVQLETIAPLTI I IQKFQTQD 
HMKFWKNLPIiHSHHIiTPS VPQTVI PKKTGSPEI KLKI TKTIQNG 
RELFE SS LCGDLLNE VQ AS E \ Q *NQS I ES RKEKRKKSNKHDS S R 
SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
ANT I F CS NNGSVHW \ FKFQVGDLVW S KVGT Y PWW PCMVSS DP Qli 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 1 
LLAEATKQASNHS E KQKI RKPRPQRERAQWD IG IAHAEKALKMT | 
REERIEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVL 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 

KTAAARKSLPAS ITMHKGSLDLQKCNMS PWKIEQVFALQNATG 
DGKFIDQFVYSTKGIGNKTEISVRGQDRLIISTPNQRNEKPTQS 
VSS PEATSGSTGS VEKKQQRRS I RTRS E S E KSTEWPKKKI KKE 
QVGFLHVES 1 


5925 


216 


1911 


MMTAE S REATGLS P QAAQE KDG I VI VKVE EEDEEDHMWGQDS TL j 
QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQIIjELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTIjLE 
DLELDLSGQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT ! 
QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 

ADS QAMVK I EDMAV S L I LEE WGCQNIiARRN KUiNKy U.N x va ts Air 
PQGGENRNENEESTSKAETSEDSASRGBTTGRSQKEFGEKRDQE 

GKTGERQQKNPEEK/l RK£KKDoCjr^AloJwlvK.i 1 iwoKor xtoiwyiv. | 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSIilRHK 

I IHTGEKP YECSECGKAF\SLNS \NLVLHQRI \HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
R IHTGEKP YECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK j 
\AFTRSSTLTLHHRIHARERASEYSPASLDAFGAFLKSCV 


5926 


2 


233


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAHEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS


5927 


4146 



1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA j 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRYNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR
KREAMQNKARAEGHMG I LQNLAAM YGGRP S S SRGGKP RNKE E EV 
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV j 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAISSKREILRRIiNENLKAQEDEKGKQN 
LS DT FE I NVHE DAKEHEKE KSVSSDRKKWE AGGQI»VI P LDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE | 
LQLQTELLENTTIRSE IS PEGEKYKPLI TGEKKVQCISHEINPS 
AI VDS PVETKS PEFS EAS PQMS LKLEGNLEE PDDLETE I LQBPS 
GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDLS KLFRTLMDVPTVGDVRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEEliRLHLEQEMGFEKFFEVYEKlKAIHE 
DEDENI E I CS KI VQNI LGNEHQHL YAKILHLVMADGAYQEDNDB 


5928 



4146 


1248


KHFS KFGSQALYQLKRPASGQNS I S VMPAQKI TKPAAKYG I PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKI SEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKRE I YGRGLPERQKGQLAVERAKQVEE FLQR 
KREAMQNKARAEGHMG I LQNLAAM YGGRPS S SRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEXKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKS SDVS PPLGQHETGGSPSKQQMRSVI S VTSALKEVGVDS 
SLTDTRETSEEMQKTNNAI SSKRE I LRRLNENLKAQEDEKGKQN 
LSDTFE INVHEDAKEHEKEKSVSSDRKKWEAGGQLVI PLDELTL 

DTS FSTTERHTVGEVIKLGPNGSPRRAWGKS PTDS VLKI LGEAE 
LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AXVDS PVETKS PE FSEAS PQMS LKLEGNLEE PDDLETE ILQE PS 
GTNKDE \ SLPCTI TDVWISEEKETKETQSADR ITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKVVHSE 
HLNLVPQVQS VQCS PEES FAFRSHSHLP PKNKNKNS LLIGLS TG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYSKIKAIHE 
DEDENIEICSKIVQNTLGNEHQHLYAKILHLVMADGAYQEDNDE


5929


3


1558 


LDFSMTTQLPAYVAI LLFYVSRASCQDTFTAAVYEHAAILPNAT 
LTPVSREEALALMNRNLDILEGAITSAADQGAHI IVTPEDAIYG 
WNFNRDSLYPYLEDI PDPEVNW I PCNNRNRFGQTPVQERLS CL\ 
AKNNSIYWANIGDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVPKEPEI VTFNTTFGS FGI FTCFDI 
LFHD PAVTLVKD FHVDTI VFPTAWMNVL PHLS AVE FHS AWAMGM 
RVNFIASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHSAWNWTS YASS I EALSSGNKE FKGTVFFDEFTFVK 
LTGVAGNYTVCQKDLCCHLSYKMSENI PNEVYALGAFDGLHTVE 
GRYYLQICTLLKCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
YVFPEVLLS ENQLAPGE FQVSTDGRLFS LKPTSGPVLTVTLFGR 
LYEKDWASNASSGLTAQARI IMLIVIAPIVCSLSW 


5930 


113


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVWI PS ERHGFEAAS I KEERGDE VMVE LAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLNEAS VLHNLKDRYYSGLI YTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS ILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
I PGE\LERQLLQANP ILESFGNARTVQNDNSSRFGKFIRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 
KATYERLFR WLVHR INKALDRTKRQGAS F IG I LDI AGFE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCI DL I ERPANPPGVLALLDEE CWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEG I RI CRQG FPNR I VFQ E FRQR YE I LTP 
SEQ ID NO:

Predicted beginning nucleotide location corresponding
to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding
to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH
LEEERDLKITDI 1 1 FFQAVCRG YLARKAFAKKQQQLSALKVLQR
NOUVYLKLRHWQWWRVFTKVKPLX^VTRQEEELQAKDEELLKVK
EKQTKVEGELEEMERKHQQLLEEKN I LAEQLQAETELFAEAEEM
RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ
DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILIiLEDQNSKF
IKEKKLMEDRIAECSSQLAEEBEKAKNLAKIRNKQEVMISDLEE
RLKKEEKTRQELE KAKRKLDGETTDLQDQ I AELQAQIDELKLQL
AKKE E ELQGALARGDDETLHKNNAL KWRELQAQ IAELQEDFES
EKASRNKAE KQKRDLS EELEALKTELEDTLDTTAAQQELRTKRE
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV
QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ
EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD
HQRQVASNLEKKQ \KKFDQLLAEEKS I S ARYAEERDRAEAEARE
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV
NMQAMKAQFERDLQTRDEQNEEKKRLL I KQVRELEAELEDERKQ
RALAVASKKKME IDLKDLEAQIEAANKARDEVI KQLRKLQAQMK
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS
ERARRHAEQERDELADE ITNSASGKSALLDEKRRLEARIAQLEE
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR
QQLERQNKELKAKliQELEGAVKSKFKATISALEAKIGQLEEQLE
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK

TSDVNETQPPQSE


5931


113


6082


RGNCFW I VP FTMAQRTGLE DP ER YLF VDRAVI YNPATQAD WTAK 
KLWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ 
KMNPPKFS KVEDMAE LTCLNEAS VLHNLKDRYYSGL I YTY S GL F 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I LCTGESGAGKTENTKKVI Q YLAHVAS SHKGRKDHN 
I PGE\LERQLLQANPI LESFGNARTVQNDNSSRFGKFIRINFDV 
TGY I VGANI ETYLLEKSRAVRQAXDERTFH I FYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGY I P I PGQ\QDKGNFRGDPGE AMHIMG 
FSHEEILSMLKWSSVLQFGNI S FKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGECVDYKADEWLMKNMDPLND 
NVATLXiHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTIOMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 
NAIPKGFMDGKQACERMIRALBLDPNLYRIGQSKIFFRAGVLAH 
LEEERDLKITDI IIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELEEIIiHDLESRVBEEEERNQILQNEKKKMQAHIQ 
DLEEQLDEEEGARQKLQLEKVTAEAKI KKMEEE I LLLEDQNSKF 
IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE 
RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKIiQL 
AKKEEELQGALARGDDETLHKNNALKVVRELQAQ IAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QELHAKVS EGDRLRVELAE KAS KLQNELDNVS TLL EEAE KKG I K 
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 
EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA 
KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK
TSDVNETQPPQSE


5932 


33 


572 


RHLEE I CFLFLQ KGRKLKLSG PR W EEG KPRGTGGLW VKAEANMG 
FGATLAVGLTIFVLSWTIIICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYP P PYPAQPMGP PAYHETLAGGAAAP YPASQPP YNPAYMDA 
PKAAIi 


5933 


1 


3190 


GTR KL KMADKT PGGSQKAS S KTR S SDVHS S GS S DAHMDAS G PSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFS IG KMSTAKRTLS KKEQEELKKKEDEKAAAE I YEEFLAAFEG 
SDGNKVKTFVRGGVVNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIBTKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERBERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERAIiKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKVVIPTERNLLALI 
HRMIEFWREGPMFEAMIMNREINKPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPS KKGALKEEQRDKLEE I LRGLTPRKNDIGDAMVFC 
LNNAEAAEE I VDCITES LS I LKTPLP KKIARLYLVSDVLYNS SA 
FCVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIBEKETEEVPDDLD 
GAP I EEELDGAPLEDVDGI PIDATP I DDLDG VP IKS LDDDLDGV 
PLDATEDS KKNEPI FKVAPSKWEAVDES ELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
S KFS KYS EMSEEKRAKLRE IELKVMKFQDELESGKRPKKPGQS F 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 




1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DD YAPGS HD VGD P S TT \N F YLGN I \NPQMNLKKCCCQE FGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGIiPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEE IVDCITESLS I LKTPLPKKIARLYLVSDVLYNSS A 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAI YPEPFLIKLQNI FLGLVNI IEEKETEDVPDDLD 
GAP I EEELDGAPLEDVDGI PIDATPIDDLDG VP IKS LDDDLDGV 
PLDATEDSKKNEP I FKVAPSKWEAVDESELEAQAVTTS KWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHIjYSNPIKEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DE CT PTRKERKRRHSTS PS PSRS S SGRRVKSPS PKSERS ERS ER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 


S YWL SGWRLSRP PRQFW AGWRG I GRFGTMAP VHGDD CE I G ASAL 

S DSG S FVS SRARRE KKS KKGRQEALERLKKAKAGERYKYE VEDF 

TGVYEEVDEEQYS KLVQARQDDDWI VDDDG IGYVEDGRE I FDDD 

LEDDAIJ)ADEKGKDGKARNKDKRNVKKIjAVTKPNNIKSMFIACA 

GKKTADKAVDLS KDGLliGDIIiQDLNTETPQ ITPPPVMILKKKRS 

IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG 

DDVQVESTEEEQESGAMEFEDGDFDEPMBVEEVDLEPMAAKAWD 

KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 

VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 

GKVWIESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGT 

PISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 

E YLEVKYSAEM PQLPQDLKGETFSHVFGTNTS SIiELFLMNRKI K 

GPCWLEVKKSTALNQPVSWCKVEAMALKPDLVNVIKDVSPPPLV 

VMAFSMKTMQNAK^QNEIIAMAALVHHSFALDKAAPKPPFQSH 

FCWSKPKDCI FPYAFKEVIEKKNVKVEVAATERTLLGFFLiAKV 

HKI DPD 1 1 VGHN I YGFELEVLLQR INVCKAPHWS KIGRLKRSNM 

PKLGGRSGFGERNATCGRMICDVEISAKELIRCKSYHLSELVQQ 

ILKTERWIPMENI(3NMYSESSQLLYIiLEHTWKDA\KFILQIMC 

ELNVLPLALQ I TNIAGN IMSRTLMGGRSERNBFLLLHAFYENNY 

I VPDKQ IFRKPQQKLGDEDEE IDGDTNKYKKGRKKGAYAGGLVL 

DPKVGFYDKFILLLDFNSLYPSIIQEFNICFTTVQRVASEAQKV 

TEDGEQEQIPEIiPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 

IjNPDLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 

YKGRE I LMHTKEMVQKMNLE VI YGDTDS IMINTNSTNLEEVFKL 

GNKVKSEVNKLYKLLEIDIDGVFKSLLLLKKKKYAALVVEPTSD 

GNYVTKQELKGLDI VRRDWCDLAKDTGNFVIGQ ILSDQSRDT IV 

EN I Q KRL I E IGENVLNGS VPVSQ FE INKALTKD PQD Y PDKKS LP 

HVHVALW I NS QGGRKVKAGDTVS YV I CQDGSNLTAS QRAYAP EQ 

LQKQDNLTIDTQYYIiAQQIHPVVARICEPIDGIDAVLiIATGWEL 

\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 

TCGTEN I YDNVFDGSGTDME P SLYRCSN IDCKAS PLT FTVQLSN 

KLIMDIRRFIKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 

ACMKATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHEKDK 

LKKQ F FT P KVLQD YRKLKNTAEQFLS RS GYSEVNLS KLFAGCAV 

KS 


5936 


1124 


139 


RGEE Q F DAE FRR F AC LGFGERLQE F S RLLRAVKKS KAW TC Y .LAI < 
RMLMATCCPS PTTTACTG P WQRAPPLRLLVQKREADS SGLAFAS 
NSLQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PE THRR VRLHKHG S DR PLGFY IRDGMS VR VAPQG \ LERVPG I FI 
S RLVRGGLAE STGLLAVS DE I LE VNG I EVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTSLLKSTVQLMCRLLQDKRYQCVYSLAEIFKVLASFYVILVIL 
YGLTS S YSLWWMLRS S LKQ YS FEALRE KSN YS DI PDVKNDFAF I 
LHLADQYDPLYSKRFSIF LSEVShiNKIjKyX-NJjJNJNiaw l v&jsijjt\o.K. 
LVKNAQDKI ELHLFMLNGLPDNVFELTEMEVLSLEL I PEVKLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEENLKILRLKFTEMGK 
IPRWVFHLKNLKELYLSGCVLPEQLSTMQLEGFQDLKNLRTLYL 
KS SLSRI PQVVTDLLPSLQKLSLDNEGSKLVVLNNLKKMVNLKS 
LELISCDLERI PHS IFSLNNLHELDLRENNLKTVEEIIS FQHLQ 
NLS CLKLWHNNI AY I P AQ IGALSNLEQLS LDHNN I ENLP LQLFL 
CTKLHYLDLSYNHLTFIPEEIQYL\SNLQYFAVTNNNIEMLPDG 
LFQCKKLQCLtiLGKNSLMNLSPHVGELSNIjTHREPIG\NYLETI» 
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938 


395 


1865


YKGEGFFCNQEARGERRKKKKAMS SPNI WSTGSS VYSTPVFSQK 
MTVWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPBCVPEGDV 
TV I LNNLLEG YDN KLR P D IGVKP TL I HTDMYVNS IG P VNAINME 
YT IDI FFAQTWYDRRLKFNST I KVLRLNSNMVGKIWI PDTFFRN 
SKKADAHWITTPNRMLRIWNDGRVLYSLRLTIDAECQLQLHNFP 
MDEHSCPIiEFSSYGYPREBIVYQWKRSSVEVGDTRSWRLYQFSF 
VGLRNTTE WKTTSGD YWMSVYFDLSRRMGYFTIQT YI PCTLI 
WLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKSLPKVS 
YVTAMDL FVS VCF I FVFS ALVE YG \TLHY FVS NRKPS KDKDKKK 
KNPAPTIDIRPRSATIQMNNATHLQERDEEYGYECLDGKDCASF 
F CC FEDCRTGAWRHGR I H I RI AKMDS YAR I FFPTAFCLFNIiVYW 
VSYLYL 


5939 


66 


1404 


I RPGYL KEVQENS PGHRAGLE PF FDF I VS I NGS RLNKDNDTLKD 
LLKANVEKPVKMLI YSS KTLELRETSVTPSNLWGGQGLLGVS IR 
FCS FDGANENVWHVLEVESNSPAALAGLRPHSDYI IGADTVMNE 
SEDLFS L I ETHEAKPLKL YVYNTDTDNCREV 1 1 TPNSAWGGEGS 
LGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVLSTGV 
PTVP \LLP PQVNQSLTS VPPMES S YLHLPGLMPFTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS S LTVD VTP PTAKAP TTVEDRVGDS TPVS E KP VSAA 
VD ANAS ESP 


5940 


145 


717 


RRSASRS AS PRQS AGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLP VHMGLV I TE VEQE PS FSD IAS LWWCMAVGI S YI S VYDH 
QG I FKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA \ VYLVQMWL I L I 


5941 


13 


6147


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVWLLALPVA 

WGQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

SI ICLKNSVWTGAKDRCRRKSCRNPPDPVNGMVHVI KGIQFGSQ 

IKYSCTKGYRLIGSSSATCXISGDTVIWDNETPICDRIPCGLPP 

TITNGDFISTNRENFHYGSVVTYRCNPGSGGRKVFEiiVGEPSIY 

CTSNDDQVGI WSGPAPQC 1 1 PNKCTP PNVENGI LVSDNRSLFSL 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 

VLHAERTQRDKDNFS PGQEVFYSCEPG YDLRGAASMRCTPQGDW 

S PAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGS SAS YCVLAGMESLWNS S VPVCEQI FCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 

CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSC3CTPPDPVNGMVH 

VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKCTPPNVENGIL 

VSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCBPGYDLRGAA 

SMRCTPQGDWSPAAPTCEVKS CDDFMGQLLNGRVLFPVNLQLGA 

KVDFVCDEGFQLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTI 

RCTSDPQGNGVWSS PAPRCG I LGHCQAPDHFLFAKLKTQTNASD 

FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

TAHWSTKPP ICQRI PCGLPPT I ANGDF I S TNRENFH YGS WTYR 

CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 

T P PNVENG I L VSDNRS L FS LNE WE FRCQ PG FVMKG P RRVKCQA 

LNKWEPELPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 

E PG YDLRGAAS LHCTPQGDWS PEAPRCAVKS CDDFLGQLPHGRV 

LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWNNSVP 

VCEHI FCPNPPAILNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 
TFNLIGEST IRCTSDPHGNGVWS S PAPRCELSVRAGHCKTPEQP 
PFASPTIPINDFEFPVGTSLNYECRPGYFGKMFSISCLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRIi 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSNN 
RTSFHNGTVVTYQCHTGPDGEQLFELVGERSIYCTSKDDQVGVW 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGEHTIiSHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
(Tin TTT XiCYX . P WGP VT J i PT »MT .DT/^KVS 5VCDEG FRLKGRS AS HCV 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VS YTCD PHPDRGMT FNL I GES T IRRTSEPHGNGVWSS PAPRCEL 
PVGAACPHP PKI QNGHYIGGHVSLYLPGMT I S YTCDPGYLLVGK 
GFIFCTDQGIWSQLDHYCKEVNCSFPLF^GISKELEMIOanraY 
GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 
VGTLSGTI FFI LLI IFLSWIILKHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YL YVRMRANPLAYG I SHKAYQ I DP P L \ R KHREQ \ Ij VI E \ VGRKL 
DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 
HKTEGDIFAIVSKAEEFDQIKVREEEIEELDTLLSNFCELSTPG 
G VENS YGKINI LLQTYINRGEMDS FSL I SDSAYVAQNAAR I VRA 
LFEIALRKRWPTMTYRLLNLSKAIDKRLWGWASPLRQFSILPPH 
MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 
S VMMEAFIQP I TRTVLRVTLS I YAD FTWNDQVHG TVGEPWW I WV 
EDPTNDH I YHSEYFLALKKQVI SKEAQLLVFTI P I FEPLP SQYY 
I RAVSDRWLGAEAVCI INFQHLILPERHPPHTELLDLQPLP ITA 
LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGKT 
VAAELAI FRVFNKYPTSKAVY I APLKALVRERMDDWKVRI EEKL 
GKKVIELTGDVTPDMKSIAKADLIVTTPEKWDGVSRSWQNRNYV 
QQVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 
LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 
HYCPRMASMNKPAFQAIRSHS PAKPVL I FVSSRRQTRLTALEL I 
AFLATEEDPKQWLNMDEREMEKf I I ATVRDSNLKLTLAFG IGMHH 
AGLHERDRKTVEELFVNCKVQ VL I ATS TLAWGVNFPAHLVI I KG 
TEYYDGKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 
KKD FYKKFL YE P FP VE S SLLGVLS DHLNAE I AGGT I TS KQDALD 
Y I TWTYF FRRLIMNPS Y YNLGDVS HDS VNKFLSHL I EKS L I ELE 
LSYCIEIGEDNRSIEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 
STEELLSIIiSDAEEYTDLPVRHNEDHMNSEIAKCLPIESNPHSF 

DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 
ELHAAKT KQAWNFLSHLPE INVG I SVKGS WDDLVEGHNE LS VS T 
LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PRFP KS KDEGWFL I LGE VDKREL IALKRVG YI RNHHVAS LS FYT 
PE I PGRY I YTLYFMSDCYLGLDQQYD/NLSQRYTSES FCTGQHQ 
GL 


5943 


1 


2274 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQTWLPNHWFLRLR 
EGLKNQS PTEAEKPASS SLPSS P PPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHVVLLTSDNVIRIYSLR 
E PQTPTNVI I LSEAEEESLVLNKGRAYTASLGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLYI LYENGETFLTYI SLLHSPGN/ 1 
WKAVGS IAHAS \ AAEDNYG YDACAVL CLP CVPNI LVIATESGML 
YHCWLEGE E EDDHTS E KS WDS R I DL I PS L YVFECVELELALKL 
AS GE DDP FDS DFS CP VKLHRDP KC P S R YHCTHE AGVHS VGLTW I 
H KLH KFLGS D EEDKDS LQE LSTEQKCFVEH I LCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR \ 
EDVEVAESPLRVLAETPDSFEKHIRSILQRSVANPAFLKASEKD 

I AP P PEE CLQLLS RATQ VFREQ Y I LKQDLAKEE I QRRVKLLCDQ 
KKKQLEDLSYCREERKSLREMAERLADKYEEAKEKQEDIMNRMK 
KLLHSFHSELPVLSDSERDMKKELQLIPDQLRHLGNAIKQVTMK 
KDYQQQKMEKVLSLPKPTIILSAYQRKCIQSILKEEGEHIREMV 
KQINDIRNHVNF 


5944 


167 


3428 


FS I ATFTDEPEVLTEPPSATTTTT IG I SATWTTLAGSHGKRNNT 
ITTTSS KRKNRKNKITPENVQ I IFDDPLPIS YS QPEKVNGES KS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPSPLSSPNGKLTVASPKRGQKREEGWKEVVRRSKKVSVPSTVI 
SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTES 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMGIKMTTOALSSTSQTATALTVPAISSASTHKTIKNP 
VN\NVRPGFPVSFP\LAYPPPQFAHALIiAAQTFQQIRPPRLPMT 
HFGGTFPPAQS TWGPFPVRPLS PARATNS PKPHMVPRHSNQNSS 
GSQVNS AGSLTS S PTTTTS SSASTVPGTSTNGS PS S PSVRRQLF 
VTVVKTSNATTTTVTTTASNNNTAPTNAT YPMPTAKEHYP VS S P 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSOSTPESMLiSGKSSYIjPNSDPT.HO^nT^KAPR 

FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSLIK 
MVSSSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGQADFFFLLVS EAWATGS PRA ; 

WLTCLILPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
OPOPEKPESTLDGAAARAFYRZHjTGDEc; QAPDQDP QOTTrpaPPP 

KRK^Q^IMKAPAAEAVAEGASGPJIGQGRSLEAEDKMTHRILRAA 
QEGDLPBLRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESH 
GETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGP 
QPPNLPLGVP ISSPGFKTirJiRGGWEPGMGLGPRGEGRANPI PTV 
LKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRE \TP PRVATLSW 
REERRREE\KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ILGS YSS IQPEEYS \SWC \EWLQDLLA\ YVS PK\HS YLRDLP 
S EGS PQRVNS I DFV\ EL\ EHLO PD VLVHAVLRWDF / T I LTEAV 
YSYRGQKQKKVMLTVEQAQDQHYALVIjWGPGAAW\YPQLQRKKG 
YIWEFKYIiFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPSFVKISDLATHLEDKCSGWLIKAQISELAFPITASQ 
KIALNAHSSLKSIFSSLPNIVYTGCAKCGLELETDENRIYKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVIVPSSEITYGMWADLFHSLLAVSAEPCVLKIQSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947 


3 


1317 


RGI PDRRRRGP I GR VNMDLENKVKKMGLGKEQG FGAPCLKCKEK 
CEG FELH F WRK I CRNC \ NVAKKSM / T VLLSNE E DRKVGKLFEDT 
KYTTLI AKLKS DG I PMYKRNVM I LTNP VAAKKNVS INTVTYEWA 
P PVQNQALARQ YMQMLP KEKQPVAGSEGAQYRKKQLAKQLPAHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
D PAI YAERAG YDKLWHPAC F VCS TCHELLVDM I YF WKNE KL YCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKPVCKP CYVKNHAWCQGCHNAI DPEVQRVTYN 
NFS WHASTECFLCSCCS KCL IGQKFMPVEGMVFCS VECKKRMS 


5948 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
S P PSAPRRP P VYY KF I E KS AE E LDNEVE YDMDEED YAWLE I VNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCIOIDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQS RARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTP PGCTRRP LN I YGDVEMKNGVCRKE S S VKTVRS TS KVR 
KKAKKAKKALAE PCAVLPTVCAPYI PPQRLNRI ANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLI ELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMD FATMRKRLEAQG YKNLHE FEEDFDLII DNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAP KCGRGKPALVRRHTIjEDRS EL I S CI ENGNYAKAAR I AAEV 
GQSSMWISTDAAAS VLEPLKWWAKCSGYPS YPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKS KMVPLG IDETIDKLKMMEGRNSS IRKAVR I AFDRAMNHL 
SRVHGEPTSDLSDID 


5949 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGR CHRGS AARHP S S P CS VKHS P TRETLTYAQAQRM 
VEIEIEGRLHRISIFDPLE I ILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SPPS APRRPPVYYKFIEKS AEELDNEVEYDMDEEDYAWLE I VNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCI CMDGECQNSNVILFCDMCNLAVHQECYGVP YI PEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ I P 
E \ VGFANTVFIEP IDGVRNI PPARWKLT\ CNLCKEKGR/VGAC I 
QCHKANCYTAFHVTCAQKAGLYMKMBPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRS VLDQLQD KD PAR I FAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAP KCGRGKP ALVRRHTLEDRS EL I S C I ENGNYAKAAR I AAEV 
GQ S S MW I S TDAAAS VLE PLKWWAKCSG YPS YPAL 1 1 DP KMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 

SRVHGEPTSDLSDID 


5950 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR 
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDSISGKTGETGVEEMIATRK 
VEQDS KETVKLSHEDDHI LEDAGSSD I S SDAACTNPNKTENS LV 
GLPS CVDEVTECNLELKDTMGI ADKTENTLERNKI EPLGYCEDA 
ESNRQLESTEFNKSNLEWDTSTFGPESNILENAICDVPDQNSK 
QLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 
VIHSKQNMTTDAPKKIVAAKYEVIHS KTKVNVKSVKRNTDVPES 
QQNFHRPVKVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKKTLQ 
DQTLVQIFKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK 

QCHKPQQQAPAMKTNSHVKBELEHPGVEHFKEEDKLKLKKPEKN 

LQPRQRRSSKSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYM 

WTPS KQCGFCKKPHGNRFMVGCGRCDDWFHGDCVGLS LS QAQQM 

GEEDKEYVCVKCCAEEDKKTE I LDPDTLENQATVE FHSGDKTME 

CEKLGLSKHTTNDRTKY IDDTVKHKVKI LKRESGEGRNS SDCRD 

NE I KKWQLAPLRKMGQPVLPRRSSEEKSEKI PKESTTVTCTGEK 

AS KPGTHEKQEMKKKKV\EKGVLNVHPAAS AS KPS ADQ I RQSVR 

HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFSFFRDTDAK 

YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS 

KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD 

APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS 

QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVWGVARKHSDNE 

AESIADALSSTSNILASEFFEEEKQESPKSTFSPAPRPEMPGTV 

EVESTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 

PDSIQVGGRISPQTVWDYVEKIKASGTKEICVVRFTPVTEEDQI 

S YTLLFAYFSSRKR YGVAANNMKQVKDM YLI PLGATDKI PHPLV 

P FDGPGLELHRPNLLLGIj I IRQKLKRQHSACASTSH I AETPESA 

PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLKKQRNKPQQ 

NLQEDLPTAVEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLEL 

ANKPLP VDD I LQS LLGTTGQVYDQ\ AQSVMEQNTVKE I P FLNEQ 

TNS KI E KTDNVE VTDGENKE I KVKVDN I S ESTD KSAE I E TS WG 

SSSISAGSLTSLSEjRGKPPDVSTEAFLTNLSIQSKQEETVESKE i 

KTIiKRQLQEDQENNLQDNQTSNSS PCRSNVGKGNIDGNVS CSEN 

LVANTARSPQFINIjKRDPRQMGRSQPVTTSESKDGDSCRNGEK 

HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 

PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 

VGNTCPSEFPSKS ITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 

SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 

VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 

RRHSDP WGRQDQQQLDRPFNRGKGDRQRFYS DSHHLKRERHEKE 

WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 

GKASRDS RNVDKKPDKP KS EDYE KDKERE KS KHREGE KDRDR YH 

KDRDHTDRTKSKR 


5952 


3226


639


PPARRSARDI»PRALSMEAARPSGSWNGALCRLL\LVTL\AFLIF 
ASDACKNVTLHVPSKLDAEKLVGRVNLKECFTAANLIHSSDPDF 
QILEDGSVYTTNTILLSSEKRSFTILIjSNTENQEKKKIFVFLEH 
QTKVLKKRHTKEKVTjRRAKRRWAPI pcsmlenslgpfplflqqv 
QSDTAQNYTI YYS IRGPGVDQEPRNLFYVERDTGNLYCTRPVDR 
EQYES FE I IAFATTPDGYTPELPLPL 1 1 KIEDENDNYP I FTE ET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVP PS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGL 
QTTSTCI INIDDVNDHLPTFTRTSYVTSVEENTVDVEILRVTVE 
DKDLVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCWKPL 
NYEEKQQMILQIGWNEAPFSREAS PRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
GRTCTGTLGI ILQDVNDNS PFI PKKTVI ICKPTMSSAEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPrrVRDRLGMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILLGIALFFCILFTLVCGASGTSKQPKVIPDD 
LAQQNLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
I KNGGQETI EMVKGGHQTSES CRGAGHHHTLDS CRGGHTE VDNC 
RYTYSEWHSFTQPRLGEESIRGHTLIKN 


5953 


330 


811


PLLCNPDPGWYWWVKQESEISKESQEMDARPKLDLGFKEGQTIK 
LCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTI PPPSS / V 
KLPSTNHVTPPSIPKSNHGGSDADIIjLDLDSPAPVTTPAPTPVS 
VSNDLWGDFSTAS S S VPNQAPQPSNWVQF 


5954 


32 


2130 


PPPPPPKLANMADLEAVIiADVSYIjMAMEKSKATPAARASiailVIi 
PEPSIRSVMQKYLAERNEITFDKIFNQKIGFLLFKDFCLNEINE 
AVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 
PFS KQAVEHVQSHLS KKQ VTSTLFQ PY I BE I CESLRGD I FQKFM 
ESDKFTRFCQWKNVELNIHLTMNEFSVHRI IGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKRIKMKQGETLALNERIMLSLVSTGDCPFI | 
VCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYA 
TEI ILGLEHMHNRFVVYRDLKPANILLDEHGHARIS\DLGLACD 
FSKKJCPHASVGTHGYMAPETVIjOKGTAYDSSADWFSIjGCMLFKLL 
RGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPELKSLLEGIiL 1 
QRDVS KRLGCHGGGSQEVKEHSFFKGVDWQHVYIjQKYPPPIi ipp 
RGE VNAADAFD I GS FDEEDTKGI KLLDCDQELYKNFPLVI S ERW 
QQEVTETVYEAVNADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVXjQCESDPEFVQWKKELNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGIj 1 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR j 
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
WKRCINIWRDVGIiFGVLNEIANSEEEVFEWVKTASGWALALCR 
WAS SLHGS L FPHLSLRS EDL IAEFAQVTNWS S CCLRVFAWHPHT \ 
NKFAVALLDDS VRVYNAS ST I VPSLKHRLQRNVAS LAWKPLS AS j 
VLAVACQSCILIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA 
WAPSGGRLLSASPVDAA1RVWDVSTETCVPLPWFRGGGVTNLLW 
SPDGSKILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWSPD | 
GSRLLFTVLGEPIiIYSLSFPERCGEGKG\ALEVQSQQRLWQICL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL | 


5956


1705 


139 


GVGVRGARAMATVQ EKAAALNLSALHS PAHRP PG FS VAQ KP FGA 
TYVWS S I INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVI FSHL 
IQNKYFGDVDI PRAKVVRVCQALMDYKVFEAVPTKVFGKDKKPT 
FEDSSCSLYRFTTIPNQDSQLGKENKLYSPARYADALFKSSDIR 
QASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRL
LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK

AYSDS QEDEWLS AAIDCS E YLPDQMVVE ISRS FPEQPDRTDL VK 
E LLFDAI GRYYS S REPLLNHLS DVHNG I AELLVNGKTEIAIiEAT 
QLLLKLLDFQNREEFRRLLYFMAVAANP SE FKLQKE S DNRMWK I 
RIFSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS\VK\LMAIQNGRDPNRDAGYIYCQRIDQRDYSNNTEKTTKDE j 
1 LIjNLLKTLDEDSKLSAKEKKK\LLGQFYKCHPDI FIEHFGD j 


5957


1479 


i A ^1 


FT .OVAVAMDTLDRVVKPKTKRAKRFIiEKREPKLNENI KNAML I K *j 
GGNANATVTKVTjKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 

SKKSDCSLFMFGSHNKKRPNWLVTGRMYDYHVLDMIELGIENFV
SLKDIKNSKCPEGTKPMLIFAGDDFDVTEDYRRLKSLLIDFFRG
PTVSNIRIAGLEYVLHETALNGKIYFRSYTCLLLKKSGCRTPRIE
LEEMGPSLDLVXRRTHLASDDLYKLSMKMPKALKPKKKKNISHD
TFGTTYGRIHMQKQDLSKLQTRKM\KGLKKRPAERITEDHEKKS
KRIKKKLMELSQPLLFHCVIILKRIIKHQSIQSFL


5958 


1 


3138


AAALGMIiLWFPACQAFNIiDVEKLTVYSGPKGSYFGYAVDFHIPD j 
ARTASVLVGAPKANTSQPDIVEGGAVYYCPWPAEGSAQCRQIPF 

DTTNNRKI RVNGTKE P I EFKSNQWFG\ ATVKA\HKGKS CGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGYCQAGFSLDFYKNGDLIVGGPGSFYWQGQVITASVADIIA 
NYSFKDILRKLAGEKQTEVAPASYDDSYLGYSVAAGEFTGDSQQ | 
ELVAGI PRGAQNFGYVS I INSYDMTFIQNFTGEQMAS YFGYTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYLQVSSLIi 

FRDPQ I LTGTETFGRFG S AMAHLG DLNQDG YND IA I G VP FAGKD 
QRGKVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SD I DKNDYPDL I VGAFGTGKVAVYRARPWTVDAQLLLHPM I IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQSIANTIVLMAEVQLD 
SLKQKGAIKRTLFLDNHQAHRVFPLVIKRQKSHQCQDFIVYLRD j 
ETEFRDKLSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
QAHILVDCGEDNLCVPDLKLSARPDKHQVI IGDENHLMLI INAR j 
NEGEGAYEAELFVMIPEEADYVGIERNNKGFRPLSCEYKMENVT 
RM WCDLGNPMVSGTNYSLGLRFAVPRLEKTNMS INFDLQ I RS S 
NKDNPDSNFVSLQINITAVAQVEIRGVSHPPQIVLPIHNWEPEE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EPHKEEBVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
LYIFHIQTLGPLQCQPNPNINPQDIKPAASPEDTPELSAFLRNS 
T I PKLVRKRDVHWEFHRQSPAKI LNCTN I ECLQ I SCAVGRLEG 
GESAVLKVRSRLWAHTFLQRKNDPYALASLVSFEVKKMPYTDQP 
AKLPEGS I AIKTSVI WATPNVS FS I PLWVI ILAILLGLLVLAIL 
TLALWKCGFFDRARPPQEDMTDREQLTNDKTPEA 


5959 


1 


1166 


GTSGYAAQQLPSLLKEREFHLGTLNKVFASQWLNHRQWCGTKC 
NTLFWDVQTSQITKI P I LKDREPGGVTQQGCGIHAI ELNP'SRT 
LLATGGDNPNSLAIYRLPTLDPVCVGDDGHKDWIFSIAWISDTM 
AVSGS RDGS MGLWE VTDDVLTKS DARHNVS RVPVYAH I THKALK 
D I PKEDTNPDNCKVRALAFNNKNKELGAVS LDG YFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFIi 
EERLSACTGSKPRIAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 



870 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKE 
LDWASINGFCEQLNEDFEGPPLATRLLAHKIQSPQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKI LELLYS WTVGLPEEVKI AEAYQMLKKQG\ I VKSDPKLPDDT 
TFPLPPPRPKNVIFEDEEKSKMIARLLKSSHPEDLRAANKLIKE 
MVQEDQKRMEKISKRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EALAEILQA 
NDNLTQVINLYKQLVRGEEVNGDATAGS I PGSTSALLDLSGLDL 
PPAGTT YPAM PTRPGEQAS PEQPS AS VSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNSFQSSDATEPPAPALAQAPSMESRPPAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRIIiFHFARDPLPGRSDVLVWVSM 
LSTAPQPIRNIVFQSAVPKVMKVKLQPPSGTELPAFNPIVHPSA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SG E PR P EPGNMATCIGE K I E DFKVGNLLG KGS FAGVYRAE S I HT 
GLBVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYIiKNRVKPFSENEARHFMHQI IT 
GMLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGLATQLKMPHE 
KHYTLCGTPNY I S PEIATRS AHGLE SDVWS LGCMFYTLL IGRP P 
FDTDWKNTLNKVVLADYEMPTFIjSIEAKDLIHQLLRRNPADRL 
SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTS SSSGS FERPDNNQALSNHLCPGKTP FPFADPTPQTE 
TVQQWFGNLQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDASDNAHS VKQQNTMKYMTALHS KPE I IQQECVF 
GSDPLSEQSKTRGMEPPWGYQNRTIiRSITSPLVAHRLKPIRQKT 
KKAWSILDSEEVCVELVBCEYASQEYVKEVLQISSDGNTITIYY 
PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
SRFVQLVRSKSPKITYFTRYAKCILMENSPGADFEVWFYDGVKI 
HKTEDFIQVTEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
I CLALES I ISEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSNYPTRDRAS FNRMVMHS AAS PTQAP I LNPS MVTNEGLGLTT 
TASGTD I SSNSLKDCLPKS AQLLKSVFVKNVGWATQ\LTSGAVW 
VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENEKLPDYIKQ 
KLQCLSS ILLMFSNPTPNFH 






Z44 / 


R V CSS S ASTAS QAVMADAW E E I RRLAAD FQRAQ FAEATQRLS ER 

NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 

GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQIjIDEN 

YLDRLAEEVNDKLQESGQVTISELCKTYDLPGNFIjTQALTQRLG 

RIISGHIDLDNRGVIFTEAFVARHKARIRGLFSAITRPTAVNSL 

ISKYGFQEQLLYSVLEELVNSGRLRGTWGGRQDKAVFVPDIYS 

RTQ STWVDS F FRQNG YLEFDALS R LG I PDAVS Y I KKRYKTTQLL 








flkaacvgqglvdqveasveeaissgtwvdiapiilptslsveda 
aillqqvmrafskqastwfsdtvwsekfX indctelfrelmh 
qkaekemknnpvhliteedlkqistlesvstskkdkkderrrka 
tegsgsi^gggggnareykikkvkkkgrkdddsddesqsshtgk 
kkpeisfmfqdeiedflrkhiqdapeefiselaeylikplnkty 

T t?TnrDCTriTMC OT"*POTi CPT i r , DTTDT'T TTFiTi^P'PVQTyTT.VKrMTPT iPRT^n 
JjfiVVKbvrWoo X 1 o/ibb i. \jt<J\t\ X ± jSJJLt^jZiCi V olN Ij XIhIN XlvJjr dlWJ 

MKFFADDTQAALTKHLLKS VCTD ITNLI FNFLASDLMMAVDDPA 
AITSEIRKKILSKLSEETKVALTKLHNSLNEKSIEDFISCLDSA 
AEAa5IMVKRGDKKRERQrLBX3HRQALAEQLKOTEDPALILHLT 
SVLLFQFSTHSMLHAPGRCVPQI XAFLNSKI PEDQHALLVKYQG 
LVVKQIiVSQSKKTGQGDYPLNNEIiDKEQEDVASTTRKELQELSS 
S I KDLVLKSRKSSVTEE 


5963 


62 


1130 



P WNPQDFPGNRGLMG \QKGE IG P P \GQQGKKGAPGMP \GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 

AKGERGEKGEPGVRGAIGS KGESGVDGLMGPAGP KGQPGDPGPQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQLPVLLQSGRIRNCDH 
CLSQHGSPGIPGPPGPIGPEGPRGLPGLPGRDGVPGLVGVPGRP 
GWGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGDPGLPGKDGDHGKPG I QGQPGP PG I CDPSLCFS VIARRDP F 
RKGPNY 


5964 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGHNR 
KKHLNYTE FTQFLQELQLEHARQAFALKDKSKSGMI SGLDFSD I 
MVT I RSHMLTP FVEENLVSAAGGSI SHQVSFS YFNAFNS LLNNM 
ELVRK I YSTLAGTRKDAEVTKEEFAQSAIRYGQATPLE IDI LYQ 
IiADLYNASGRLTLADIERIAPLAEGALPYNLAELQRQQSPGLGR 
P I WLQ IAESAYRFTLGSVAGAVGATAVYP IDLVKTRMQNQRGSG 
SWGELMYKNSFDCFKKVLRYEGFFGLYRGLIPQLIGVAPEKAI 
KIjTVNDr VKDKr TKKLajo VFJjVAE»VJj/i*jvjV-/\i3LaoUV xr iiNf u&x 
VKIRLQVAGE ITTG PRVSALNVLRDLG I FGLYKGAKACFLRD I P 
F S A I Y F P V YAH C KL LLADE NGH VGG LNL LAAGAMAG \ VP AAS LV 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSS PQFG \ VTLVTYELLQRGFYIDFGGLKPAGSEPTPK 
SRIADLPPANPDHIGGYRLATATFAGIENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVTWL YRFLP TSNMAAKLRS LL P PDLRLQFWLHARLQKCFLSRG 
CGSYCAGAKASPLPGKMAMGLMCGRRELLRLLQSGRRVHSVAGP 
SQWLGKPLTTRLLFPAAPCCCRPHYLFLAASGPRSLSTSAISFA 
EVQVQAP P WAATPS PTAVP EVASGETAD WQTAAEQS FAELGL 
GS YTPVGLIQNLLE FMHVDLGLPWWGAIAACTVFARCL I FPL1 V 

YQKKHG I KL YKPL I LP VTQAP I F I S FF I ALREMANL PVPSIiQTG 
GLWWFQDLTVSDPIYILPLAVTATMWAVLELGAETGVQSSDLQW 
MRNVIRMMPLITLPITMHFPTAVFMYWLSSNLFSLVQVSCLRIP 
AVRTVLKI PQRWHDLDKLPPREGFLESFKKGWKNAEMTRQLRE 
REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNI PSS \SSS 
SSKPKSKYPWHDTLG 


5966 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF








KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5967 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 
RAIDLKKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETERVWEALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR 


5968 


81 


1288


VRFPRRGGAFPTVLTPGRQQGVFLGPQRPGSEPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRLALGL^FCLI^TSI^VLWVYIj 
ENWLPVSYVPYYLPCPEIFNMKIJfYKREKPLQPVVWSQYPQPKL 
LEMRPTQLLTLTPWLAPIVSEGTFNPELLQHIYQPLNLTIGVTV 
FAVGN/HFLESAEE FFMRG YRVHYYI FTDNPAAVPGVPLGPHRL 
LSS I PIQGHSHWEETSMRRMETISQHI AKRAHREVDYLFCLDVD 
MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA 
DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANGIMAAWR 
EESHLNRHFI SNKPS KVLS P E YLWDDR KPQP P S L KL I RFS TLD K 
DISCLRS 


5969 


1126 


503 


DVGFNIKRKRCDLDVFLESPRKPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIWLFGISITGGLFYTI 
FKELFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHTOFTEYVKIX3LKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GSGE YDFRY I FVEI ES YPRRTI I I EDNRSQDD 


5970 


316


4712 


SQDNIGHRLIiQKHGWKLGQGLGKSIiQGRTDPIPIVVKYDVMGMG 

RMEMELDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 

KALEDLRANF YCELCDKQYQKHQE FDNH INS YDHAHKQRLKDLK 

QREFARNVSSRSRKDEKKQEKALRRLHELAEQRKQAECAPGSGP 

MFKPTTVAVDE EGG EDDKDES ATNSGTGATAS CGLGS E FS TDKG 

GPFTAVQITNTTGLAQAPGLASQG ISFGI KNNLGTPLQKLGVS F 

SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 

GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 

PEYYHYIPPAHCBCVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 

KKGS S PKP KS C I KAAAS QGAEKTVS E VS EQP KETSMTE PS EPGS 

KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA 

GKESQEGPKHPTCPFFPVLSKDESTALQWPSELLIFTKAEPSIS 

YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGIjDPGE 

PNKSKEVGGBKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 

TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 

SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 

KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 

DAS S DQS CYSRQRS YSDDS YS DYS DRSRRHS KRSHDSDDSD YAS 

SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 

XRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 

DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 

EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 

LGNKPVLPLIGKLPATRKPNKKCEESGLERGEEQEQSETEEGPP 

GSSDALFGHQFP\SEETTGPLLDPPPEESKSGEVTADHPVAPLG 

PPAHFDCYLGDPT1SHNYLPDPSDGNTLESLDSSSQPGPVESSL 

LPIAPDLEHFPSYAPPSGDPSIESTDGAEDA\SLAPLESQPITF 



TPEEMEKYSKLQQAAQQHIQQQIiliAKQVKAFPASAALAPATPAL 
QP IHIQQPATASATS ITTVQHAILQHHAAAAAAAIG IHPHPHPQ 
PIiAQVHHI PQPHLTP I SLSHLTHS 1 1 PGHPATFLASHP IHI I PA 
SAIHPGPPTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 



5971 



53 



2149 



S FL YFVGVDMDNP I GNWDGRFDGVQLCS FACVE ST I LliHIND 1 1 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSELFYTI*NGSSVDSQPQSKSKNTWYIDEVAEDPAKSLTEISTD 
FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP 
PFYGVIRWIGQPPGLNEVLAGLELEDECAG\CTDGTF/REGTRY 
FTCALKKALFVKLKS CRPDSRFASLQPVSNQI ERCNSLAI WEAY 
LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQELIiRTEIVNPliRIYG 
WCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHILRV 
EPLLKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNIiKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPDISAGKIKQFCKT(^NTQVHLHP 
KRliNHKYNPVSLPKDLPDWDWRHGCI PCQNMELFAVLCI ETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYL 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTMSIjYK 



5972 



440 



1761 



ILLAGSPSPRDQCSQRQSSGGDKELVTRGCTFSTAWSPSAMTQ 
EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRC7AGSCI 
PSAIVSFTVSRRNANVI PNFQILFVSTFAVTTTCL1WFGCKLVL 
NPSAININFNLILLLLLELLMAATVI IAARSSEEDCKKKKGSMS 
DSANILDEVPFPARVLKSYS WBVIAGISAVLGGI IALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLIiLV 
LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 
GQEPPEGVRQGESLESRRGANGPVTPRRGNRVAAPSIjAPGMETH 
NP 



5973 



65 



2007 



NGDGKDLFGH I WAWRSNG 1 1 SNFRRS PHAGMAEDE PDAKS P KTG 
GRAP PGGAE AGE PTTLLQRLRGT I SKAVQNKVEG I LQD VQKF S D 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKI IRE IFPDI 
KARRLGGRGQS KYCYSGIRRKTLVSMPPLPGLDLKGSES PEMGP 
EVTPAPRDELVEAACALTCDWAERI LKRS FSS I VEVARFLLQQH 
h I SARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERLAQPPKDLEARTGAGPbARGBRKKSWESSAPGANNLQV 
NAIiVARLPLIaLPRAPRSLIPPIPVSPPILAPRLSSGALKVATLP 
LS S RAG AP PAAVP I INMILPTVPALPGPGPGPGRAP PGGLTQPR 
GTENREVGIGGDQGPHDKGVKRTAEVPVS EASGQAPPAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVLAQG\QGDGTVSK 
GGRGPGSQHTKEAEDKIPLVPSXVSVIKGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 



LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC 
VCVRKRPLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM



5974 



4293 



2200 











EEQVVEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL 


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQVVEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 



VHHLHLTRVS VWNLD 1 1 LRI AQQMGIKTLNLVLG \ LKRA\ LBF 

PEVSWMEVKDPNMKGAJMLTNTGKYAIPTIDA\EAYAIGKKEKPP 

FLPEEPS SSSEEDDPI PDELLCLI CKDIMTDAWIPCCGNS YCD 

ECIRTALLESDEHTCPTCHQNDVSPDALIAtoFLRQAVNNFKNE 

TGYTKRLRKQLPSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 

MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 

ATVSISVHSEKSDGPFRDSDNKILPAAAIiASEHSKGTSSIAITA 

LMEEKGYQVPVLGTPSLLGQSLLHGQLXPTTGPVRINTARPGGG 

RPGWEHSNKLGYLVSPPQQIRRGERSCYRSINRGRHHSERSQRT 

QGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 

GQP \ P P AGYS VP PPGFPPAPANLSTP WVS SGVQTAHSNT I PTTQ 

APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 

KERRRS FSRSKS PYSGSS YSRS S YTYSKSRSGSTRSRS YSRS FS 

RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 

RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM 

KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYXGYAAGAQPR 

PSANRENFS PERFLPLNIRNS PFTRGRREDYVGGQSHRS RNIGS 

NYPEKLSARDGHWQKDNTK5KEKESEWAPGIX3KGNKHKKHRKRR 

KGEESEGFLNPELLETSRKSREPTGVEENKTDSIiFVLPSRDDAT 

PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 

SKKENIVKPAKGPQEKVDG\DVRDLLDLNL\QLKKPKEETPKDL 

TILNHHLPLRRMKKSL\ EP P\ EKLTLNQQK\TPRWKTSQRGKSE 

EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHSINHILHPGAGVAAGPATGW/REYLT 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAI IEEDDGDGGW 
DTYHNTGITGITEAVKEITLENKDNIRLQDCSALCEEEEDEDEG 
EAADMEEYEESGLLETDEATIiDTRKIVEACKAKTDAGGEDAILQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTIENHPHLPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHM YLL I FLKFVQAVI PT I E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQS PLTWAPG FYRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMK 
GGHTFKPLAEIYEQHVTKVNEEVAKLRRRLMELISLVQEVERNV 
EAVRNAKDERVRE IRNAVEMM IARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
AS FVTTP VP PDFTSELVPS YDSATFVLENFSTLRQRADPVYS P P 
LQVSGLCWRLKVYPDGNGWRGYYLSVFLELSAGLPETSKYEYR 
VEMVHQSCNDPTKNI IREFASDFEVGECWGYNRFFRLDLLANEG 
YLN^QNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERLTIELSRTQKSRDLSPPDNHLSPQNDDALETRAKKSA 








CSDMLLER\GPYSAS\VREAKEDEEDEEKIQNEDYHHELSDGDL 
DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV 
EYNNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 
ATSSLLD IDPLILIHLLDLKDRSS IENLWGLQPRPPASLLQPTA 
SYSRKDKDQRKQQAMWRVPSDLKMLKRLKTQMAEVRCMKTDVKN 
TLSEI KSSSAASGDMQTSbFSADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSIJRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDIIiPKTE 
DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
QEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEQIGPEDIiSFNT 
DENSGR 


5979 


212 


3665 



LPDMTMYLWLKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNASE 

TTTLS PSGSAVI STTT I ATTP SKPTCDE KYAN I TVD YL YNKETK 

LFTAKLNWENVECGNNTCTNNEVHNLTECKNASVSISHNSCTA 

PDKTLIIiDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 

DTQN ITYRFQCGNMIFDNKE I KLENLE PEHE YKCDSE ILYNSHK 

FTNAS KI IKTDFGS PGEPQI I FCRS EAAHQGVITVJNPPQRS FHN 

FTLCY I KETEKDCLNLDKNLI KYDLQNLKPYTKYVLSLHAY I IA 

KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 

DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 

AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIAIiLW 

LYKI YDLHKKRS CNLDEQQELVERDDE KQLMNVE PIHAD I LLET 

YKRKI ADEGRLFLAEFQS I PRVFS KFP I KEARKP FNQNKNRYVD 

ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 

DETVDDFWRMIWEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 

RAFGECCCKDIiTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 

FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 

TGTYIGI DAMLEGLEAENKVDVYG YWKLRRQRCLMVQVEAQY I 

LIHQAIjVEYNQFGETEVNLSELHPYLHNMKKRDPPSEPSPLEAE 

FQRLPS YRSWRTQHIGNQE \ENKSKNRNSNVIP YDYNRVPLKHE 

LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 

AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 

EGKQTYGDIEVDLKDTDKSSTYTLRVFEIjRHSKRKDSRTVYQYQ 

YTNWSVEQLPAEPKELISMIQVVKQKLPQKNSSEGNKHHKSTPL 

L I HCRDCaS QQTG I FCALLNLLES AE TE E WD I FQ WKALRKARP 

GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 

KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEGPEHSVNGP 

ASPALNQGS 


5980 


3 


2363 


DAWGCKLRRIiRFTYGTQTRVSLALPGQYEIjVHTLVAHQGNWETI 
PEEDLEVQENNEDAAHDLTELEVTOHHALLQEVDVVVAPCQGLR 
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQEIRKYFSFP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHLSTFSHQVLQTRLVDAAKALN 
LVHCHCLDIFINQAFDMQRDLQITPKRLEYTRKKENELYESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTREIKCCIRQIQEIjIISRLNQAVANKLISSVDYLRESFVGTL 
ERCLQSLEKSQDVS VHI TSNYLKQ ILNAAYHVEVTFHSGS SVTR 
MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK 
SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRIARLSIJESRSLQDVLIiHRKPKLGQEIjGRGQYGVVYLCDN 

wgghf p cal ks wp pdekhwndlale fh ymrs lpkherlvdlhg 
svidynygggssiavllimerlhrdlytglkagltletrlqial 
dwegirflhsqglvhrdiklknvlldkqnrakitdlgfckpea 
mms gs i vgtp i hmapelftgk ydns vdvyafg i lfw y i cs gs vk 
lpeafercas kdhlwknvrrgar perlp vfdeecwqlmeacwdg 
dplkrpllgivqpmlqgimnrlcksXnseqpnrglddst 


5981 


1 


2519 


GRRHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 






5982 



56 



5983 



248 



5984 



755 



5985 



22 



2316 



TE FGMA I G P EN S G KWLTAE VS GGS RGG R I FRS S D FAKNF VQTD 
LP FHPLTQMMYS PQNSDYLLALSTENGLWVSKNFGGKWEE IHKA 
VCLAKWGSDNTI FFTTYANGS CKADLGALEIiWRTSDLGKSFKTI 
GVKIYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
S VGQEQ F YS I LAANDDMVFMHVDEPGDTGFGT I PTS DDRG I VYS 
KS LDRHLYTTTGGETDFTNVTSIiRGVYI TSVLSEDNS IQTMITF 
DQGGRWTHLRKPENSECDATAKNKNECSLHIHASYSISQKLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EGPHYYTILDSGGIIVAIEHSSRPINVI KFSTDEGQCWQTYTFT 
RDPIYFTGliASEPGARSMNIS I WGFTES FLTSQWVS YT I DFKD I 
LERNCEEKDYTI WLAHS TDPED YEDGCI LGYKEQ FLRLRKSSVC 
QNGRDYWTKQPS I CLCSLEDFLCDFG YYRPENDSKCVEQPBLK 
GHDLE FCL YGREEHLTTNGYRK I PGDKCQGGVNPVREVKDLKKK 

ctsnflspekqnsksnsvpiiiaivglmlvtvvagvlivkkyvc 
ggrflvhlysvlqqh\aea\ngvdgvdaldtashtnksgyhdds 

DEDLLE 



1763 



1193 



1408 



ATRP PRGSS WCRQFSRTASAAPGRSNMLRI P VRKALVGLS KS P K 
GCVRTTATAASNLIEVFVDGQSVMVEPGTTVLQACEKVGMQIPR 
FCYHERLSVAGNCRMCLVE IEKAPKWAACAMPVMKGWNI LTNS 
EKSKKAREGVMEFLLANHPLDCPI CDQGGECDLQDQSMMFGNDR 
SRFLEGKRAVEDKNI G PLVKT I MTRCIQCTRCIRFAS E I AG VDD 
LGTTGRGNDMQVGTYI EKMFMSELSGNI IDI CPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
EE W ISDKTR FAYDGLKRQRLTBPMVRNEKGLLTYTSWEDALSRV 
AGMLQS FQG KDVAAI AGGLVDAEALVAL KDLLNRVDS DTL CTE E 
VFPTAGAGTDLRSNYLLNTriAGVEEADWLLVGTNPRFEAPLF 
N/UIIRKSWLHNDLKVALIGSPVDLTYTYDHLGDSPKILQDIASG 
SHPFSQVLKEAKKPMWLGSSALQRNDGAAILAAVSSIAQKIRM 
TSGVTGDWKVMNILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LLGADGGCI TRQDLPKDCFI I YQGHHGDVGAPIADVILPGAAYT 
EKS AT YVNT EGRAQQTKVAVTP PG LAREDWKI IRALSEIAGMTL 
PYDTL\DQVRNRLEEVSPNLVRYDDIEG\ANYFQQANELSKLVN 

QQLLADPLVPPQLTMKDFYMTDSISRASQTMAKCVKAVTEGADA 
VEEPSIC 



EARGDGGRRRHRASGRRAGRGE P \AGLKSQGQRAVPKRAVARGG 
RQ \ YSAAIALLEPAGSB IADDLS IL YSNRAACYLKEGNCSGC IQ 
DCNRALELHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLQLANDSVNRLSRILMELDGPNWREKLSL1 PAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQQGITDEKTFKALKEEGNQCVNDKNYK 
DALSKYSECLKINNKECAIYTNRALCYLKLCQFEEAKQDCDQAL 
Q LADGNVECAF YRRALAHKGLKNYQKS LI DLNKVI LLDPS 1 1 EAK 
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLLAITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 
LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLEDIQALKRQYEIi 



S S VCMACT YVS NLG KKQR S VS FLASGLMRVSTGPELRLHHS FVL 
TGDVGRRICRLLVGLFTKGDTSSKRVHPFSPGPCFLLCDLARVG 
SSPKINVSPFYQN\QTSTQRSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRS I S RFS SG 



RRVARPGTAEPAKARRTVRRGRARRDLAGAERKAGVSERGDSGR 
RRPNPS I PSAAAGMSHIQ I PPGLTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPASVLPAATPRQS LGHPPPEPG PDRVADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKrDEQRCRLQEACKDILLFKNLDQEQLSQVLDAMFERIVKAD 
EHVIDQGDDGDNF YVI ERGT YDI LVTKDNQTRSVGQYDNRGS FG 
ELALMYNTPRAATIVATSEGSLWGLDRVTFRRIIVKNNAKKRKM 
FESFIESVPLLKSLEVSERMKIVDVIGEKIYKR/DGERIITQGE 
K\ADSFYIIESGEVSILIRSRTKSNKDGGNQEVEIARCHKGQYF 
GELALVTNKPRAASAYAVGDVKCLVMDVQAFERLLGPCMDIMKR 
N I SH YE EQ LVKM FGS S VDLGNLGQ 


5986 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5988 


1292 


410 


FKKYFLSFLGLLESSHSRDRIHNLVLMFLLATHNLVWWFTCRFQ 
RLDCIYLNAGIMPNPQLNIKALLFGLFS\AEGLLTQGDKITADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHS KGKEP YSS S KYATDLLS VALNRNFNQQGLYSNVAC 
PGTALTNLTYG ILPPF I WTLLMPAILLLRFFANAFTLTPYNGTE 
ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKLLBLEKHIRVTIQKTDNQARLSGSCL 


5989 



194 


2610 



AMDFPQHSQHVLEQLNQQRQLGLLCDCTFVVDGVHFKAHKAVIiA 
ACSE YFKMLFVDQKDWHLD ISNAAGLGQVLEFMYTAKLSLS PE 
nvddvl\avatflqmqdi ITACHALKSLAEPATSPGGNAEALAT 
EGGDKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEQSEEGAGPAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRHIRIHTGEKPFSCRECSKAFSDPAACK 
AHEKTHSPLKPYGCEECGKSYRLISLLNLRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGE KP YQCDYCGRS FSD PTS KMRHLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
E KP CQCVMCGKAFTQAS S L I AHVRQHTGEKP YVCERCGKRFVQS 
SQLANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHI I IHTGEKPY 
LCDKCX3RGFNRVDNLRSHVKTVHQGKAG I KI LEPEEGS EVSWT 
VDDMVTIiATEALAATAVTQLTWP VGAAVTADETEVLKAEI S KA 
VKQVQEEDPNTHILYACDSCGDKFLDANSLAQHVRIHTAQALVM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPSLRDKDLEMEELMLQDETLLGTMQSYMDASLIS 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP 
MHLACPEEEDKATAAEMAVPAAGDES ISSLSELVRAMHPYCLPN 
LTHLASLEDELQEQPDDLTLPEGCWLEIVGQAATAGDDLEIPV 

t nmnuc Ti/^ 1 TTD m 7T T TiTkC T "CPOCRT r\T T MDTT.T?GT?T l T72i2WDTf\7 T rT. 
V VKy VoFvar'K.fc' VLiUJlJoljCi I ooi^y Liljr'lir 1 J-»E»»aJti« iiirtrW Jris.v A U 

CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENS S PKN 
LERSAGQSS PAKEGPLDLYPKLADTI QTNP IPTHLSLVDS AQAS 
PMP VDS VEAD P TAVGP VLAG P VP VD P GLVDLAS T S SE LVE PLPA 
EP VL INPVLADS AAVDPAWPISDNL P P VDAVPSG PAPVDLALV 






DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES 
LDPPKTI I P E VKEWDS LKI ES GTS ATTHEARPRPLS LS EYRRR 
RQQRQAETEERSPQPPTGKWPSLPETPTGLADIPCLVIPPAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
HKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRE 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA 
AVSLLAKAKS PKSTAQEGTLKPEGVTEAKHPAAVRLQEGVHGPS 
RVHVGSGDHDYC\VRSRTPPKK\MPALLIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCIAPS 
SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSASPSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PS PRRRSDRRRRYSS YRSHDHYQRQRVLQKERAI EERRWFIGK 
I PGRMTRSELKQRFSVFGE IEECT IHFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
DPAPVKSKFDSIiDFDTLLKQAQKNLRR 


5991 


334 



1379


RLSSHFSQCS PS IYC \TKFDKQGNVTS FERKKTELYQELGLQAR 
DLRFQHVMSITVRNNRIIMRMEYLKAVITPECLLILDYRNLNLK 
QWLFRELPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKLHILLQNGKSLSELETDI 
KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM
ELLLENYYRIzADDLSNAARELRVLIDDSQSIIFINLDSHRNVMM 
RLNLQIjTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 



609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS ILGTGLLWLPGGI KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKMFEATRLIATIVMLLCFIFTIjCAALWWHKKGLAVLFC I LQ 
FLSMTWYSLS Y I P YARDAVI KCCS SLLS 


5993 


1650 



594 


AEGLGS WAVWAGLG WAGRHMEAGGATGALGVGC KLPSAFC F PGS 
SVAMDMFQKVEKIGEGTYGWYKAKNRETGQLVALKKIRLDLEM 
EGVPSTAIREISLLKELKHPNIVRLLDVVHNERKIjYLVFEFLSQ 
DLKKYMDSTPGS ELPLHL I KS YLFQLLQGVS FCHSHR VIHRDLK 
PQNLLINELGAI KLADFGLARAFGVPLRTYTHEWTLWYRAPEI 
LLATRFYTTAVDIWSIGCIFAEMVTRKALFPGDS\EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGS FP KWTRKGLEE I VPNLE PEG 
RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994 


394 



1934 



AGEVQLHVWIRGMRIQPQ/KAAAIIDLDPDFEPQSRPRSCTWPL 
PRPE I ANQPS KPPEVEPDLGEKVHTEGRSEP I LLPSRLPE PAGG 
PQPG I LGAVTGPRKGGS RRNAWGNQS YAELISQAI ES APEKRLT 
LAQI YEWMVRTVPYFKDKGDSNSSAGWKNS IRHNLSLHSKF I KV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDS S S KLLRGRS KA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPLRPESEVLAEEIPASVSSYAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFSPAEGPLSAGEGCFSSSQALEALLTSDTPPPPADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNI ISDLMDEGEGIiDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 
I SDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMS VMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 
QHRQT\QSDIjTIEKISALENSKNSDLEKKEGRIDDLLRANCDLR 
RQ I \DEQQKMLEKYK\ ERLNRCFDNEPRNFLI EKS KQEKMACRD 
KSMQDRLRLGHFTTVRHGASFTEQWTDGYAFQNLIKQQERINSQ 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTNGAENETL 
TLAE YHEQEE I FKLRLGHLKKEEAE IQAELERLERVRNLHIREL 
KRIHNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACREYRIHKELDHPRIVKL 
YDYFSLDTDSFCTVLEYCEGNDIjDFYLKQHKLMSEKEARSIIMQ 
I VNALKYIiNE I KPP 1 1 HYDLKPGN I LLVNGTACGE I K I TDFGLS 

KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPKISNK 
VDVWSVGVIFYQCLYGRKPFGHNQSQQDILQENTILKATEVQFP 
PKPWTPEAKAFIRRCLAYRKEDRIDVQQLACDPYLLPHIRKSV 
STSSPAGAAIASTSGASNNSSSN 


5996 


1612 


981 


DQQACLLGLMLTLEFGI LEFDPS W IGSWTQR/ SWVS WRSRPGCE 
LFS IWFGS IVNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQQACIaLGIMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSIYVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSIWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLPP\ASAQP 


5999 


2 


1790 


RPPME KARRGGDGVPRGPVLHI WVG FHHKKGCQVEFS YP PLI P 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFGISCYR\QIEAKALKVRQADITRETVQKSVCVLSKLPLYG 
LLQAKLQLITHAYFEE KDFSQI S ILKELYEHMNS SLGGAS LEGS 
QVYLGLSPRDLVLHFRHKGLILFKL I LLEKKVLFY I S PVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGT I RKVMAGNHGEDAAMKTEE PLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFPKDS VPSESLPITVQPQANTGQWLI PGLI SGLE 
EDQYGMPLAIFTKGYLCLPYMALQQHHLLSDVTVRGFVAGATNI 
LFRQQ KHLSDA I VE VEEAL I Q I HDPELRKLLNPTTABLR FAD YL 
VRHVTENRDDVFLDGTGWEGGDEW IRAQFAVY IHALLAATLQLV 
LFRI VNVAKKI GNVMVTT\ SRNWQTGK\AVGQSVGGAFS \SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 


6000 


101 



1561 


TE PCRTAENCT ATMS ENNKNS LES S LRQLKCHFTWNLMEG ENS L 
DD FEDKVF YRTEFQNRE F KATM CNLLAYLKHLKGQNE AALE CLR 
KAEELIQQEHADQAE I RSLVTWGNYAWVYYHMGRLSDVQ I YVDK 
VKHVCEKFSS PYR I ES PELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\KFYRGKDEPDKAIELLKKALBYIP\NNAYLHCQIGCCY 
RAKVFQVMNLRENGMYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCS I LASLHALADQ YEDAE Y YFQKEFS KELTPVAKQLLHLRYGN 
FQLYQMKCEDKAI HH F IEGVKINQKSRE KEKMKDKLQK I AKMRL 
SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 

SWNGE 


6001 


176 


1038 


AFAHSPSRGHRHTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 

WLHFDADGSGYLE GKELQNL I Q ELQQARKKAGLELS P EMKT FVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGFIETEELKNFLKDLLEKANKTVDDTKLAEYTDLM 
LKLFDS NNDGKLE LTEMARLLP VQENFLLKFQG I KMCGKE FNKA 
FELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGG KLYRTDLAL I LCAGDN 


6002 


311 


81 


LAPPGGGLHI PPRTPLSHSRPPPSHHAPHPS PLPLPPADLHPHS 
SMAQRSDLLELDCQLTRDRVWVSHDENLCRQSGLNRDVGSLDF 
EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 
EIKGKNBELIREQ/VLVRRYDRNEITIWASEKSSVMKKCKAANP 
E M PLS FT I S RG FWVLLS Y YLGLLP FI P I PE KFFFCFL PN I INRT 
YFPFSCSCLNQLLAWSKWLIMRKSLIRHLEERGVQWFWCLNE 
ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140 



4098 


GKLRAFRGMRRLI CKR I CDYKS FDDEES VDGNRPS S AAS AFKVP 
AP KTSGNP ANS ARKPG S AGG P KVGAGAS KEGGAGAVDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIRErLSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 
HLS TVLGNKFDHGARAI VPTL FNLVPNS AKVMATSGCAA I RF 1 1 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVET I KKG IHDADAEARVEARKTYMGLRNHFPGEAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSS KAS S LPGSLQRSRSDIDVNAAAGAK 
AHHAAGQS VRS GRLGAG ALNAGS YAS LEDTSD KLDGTAS EDGRV 
RAKLS APLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREAS PSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPG YG I SQSSRLS SSVSAMRVLNTGSDVEEA 
VADALLLGD I RTKKKPARRR YES YGMHSDDDANSDASSACS ERS 
YS SRNGS I PT YMRQT\ ED V\ AEVLNRCASSNWS ERKEGLLGLQN 
LLKNQRTLSRVELKRL CE I FTRMFAD PHGKRV FS MFLE TL VDF I 
QVHKDDLQDWLFVLLTQLLKXMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RS PANWS S PLTSPTNTSQNTLS PS AFD YDTENMNS ED I YS S LRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
I RALALKVLRE I LRHQPARFKNYAELTVMKTLEAHKDPHKE VVR 
SAEEAASV\LATSI \SPEQCIKVLCPI IQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGSKMKLLNLYI KRAQTGSGGADPTTDVS 
GQS 


6004 


140 


4098 



GKLRAFRGMRRL I CKRICDYKS FDDEES VDGNRPS SAASAFKVP 
AP KTS GN PANSARKPGSAGGP KVGAGAS KEGGAGAVDE DDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
S LL VAGAAQ YDCFFQHLRLLDGALKLSAKDLRSQ WREACI TVA 
HLS TVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAI RF 1 1 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVET I KKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQSVRSGRLGAGALNAGSYASLEDTSDKLDGTASEDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEEA 
VADALLLGD I RTKKKPARRR YES YGMHSDDDANSD7VSSACSERS 
YS SRNGS I PT YMRQT\ EDV\ AB VIjNRCASSNWS ERKEGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFN I LMRFTVDQTQTPS LKVKVAI LKY I ETLAKQMDPGD 
F I NS S ETRLAVS RV I TWTTEP KS SD VRKAAQ S VL I S L F E LNTPE 
F TMLLGALP KTFQDGATKLLHNHLRNTGNGTQSS MGS PLTRPT P 
RSPANWS SPLTSPTKTSQNTLS PSAFDYDTENMNSED IY SSLRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAb\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 
VEERKIALYELMKLTQEESFSWDEHFKTILLLLLETLGDKEPT 
I RALALKVLRE I LRHQ PARFKNYAELTVMKTLEAHKDPHKEWR 
SAEEAA£V\ LATSI \ S PEQC I KVLCPI I QTADYP INLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6005 



133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALIi 

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

KRQKKERMLLCRQLGD S SGEG PEFVEE SEE VALRSD S EGSDYTP 

GKKKKKKLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLL 

EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 

KMMMVLGAKWREFSTNNPFKGSSGASVAAAAAAAVAVVESMVTA 

TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 

PKKVAPLKI KLGGFGSKRKRSSSEDDDLDVESDFDDAS INSYSV 

SDGSTSRSSRSRKKIiRTTKKKKKGEEEVTAVDGYETDHQDYCEV 

CQQGGEIILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEEILEEVGGDLEEEDDHHMEFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 

YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRIIjNHSVDKKG 

HVHYLIKWRDLPYDQASWESEDVE IQDYDLFKQS YWNHRELMRG 

EEGRPGKKLKiCVKLiRKLERPPETPTVDPTVKYERQPEYLDATGG 

TLH P YQMEGLNWLRF S WAQGTDT I LADEMGLGKTVQTAVFL Y S L 

YKEGHSKGPFLVSAPLSTI IN\WERE FEMWAPDMYV\ VTYVGDK 

DSRAIIRENEFS\FEDNAIRGGKKASRMKKEASVKFHVLLTSYE 

LI TIDMAILGS IDWACLI VDEAHRLKNNQSKFFRVLNGYS LQHK 

LLLTGTP LQNNLE ELFHLLNFLTPER FHNLEG FLEEFAD I AKE D 

QI KKLHDMLG\ PHMLRRLKADVFKNMPS KTELI V\RVELS PM\ Q 

KKYYK\YILHSKFIiKALN\ARGGGNQVSLLNVVMDLKKCCNHPY 

LF PVAAMEAP KM PNGMYDGSAL I RAS GKLLLLQ KMLKNLKEGGH 

RVL I FSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAI DR 

FNAPGAQQFCFLLSTRAGGLGINLATADTVI IYDSDWNPHND IQ 

AFSRAHR I GQNKKVMI YRFVTRAS VE ER I TQ VAKKKMMLTHL W 

RPGLGS KTGSMS KQELDD I LKFGTEELFKDEATDGGGDNKEGED 

SSVIHYDDKAIERLLDRNQDETEDTELQGMNEYLSSFKVAQYW 

REEEMGEEEEVEREIIKQEESVDPDYWEKLLRHHYEQQQEDLAR 

NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 

QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSE KE FKAYVS LF 

MRHLCE P GADGAE T FADGV PREGLSRQHVLTRI G VMS LI R KKVQ 

E FEHVNGRWSMPELAEVEENKKMSQPGS PSPKTPTPSTPGDTQP 

NTPAPVP PAEDG I KI EENS LKEEES I EGEKEVKS TAP ETAI ECT 

QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMLQNGETPK 

DLNDEKQKKNIKQRFMFNIADGGFTELHSLWQNEERAATVTKKT 

YE I WHRRHD YWLLAG I INHGYARWQD IQNDPRYA I LNE P FKG EM 

NRGNFLE I KNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 

PSMALNTRFAEVECLAESHQHLSKESMAGNKPANAVLHKVLKQL 

E E LLSDM KAD VTRLP AT I AR I P P VAVRLQMS ERN I LSRLANRAP 

EPTPQQVAQQQ 


6006 


1 


965 


DNDFLRNTVHRHEPPVTAEPIRLLAENEDWWDKPSSIPVHPC 
GRFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 

ER I HEQ VRDRQLE KE YVCR VEGE FPTE E VTCKEP I LWS YKVG V 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGHPILNDPIYNSVAWGPSRGRGGYIPKTNEELLRDLVAEHQAK 
QSLDVLDLCEGDLSPGLTDSTAPSSELGKDDLEELAAAA\QKMB~ 

EVAEAAPQELDTIALASEKAVETDVMNQ\RQT\TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQVE YVFTDKTGTLTENEMQFRECS INGMKYQE INGRLVPE 
GPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRTSPENETELIK 
EHDLFFKAVSLCHTVQINNVQTDCTGDGPWQSNLAPSQLEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTU5KLERYKLLHILE 
FDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGBIEKTRI 
HVDEFAIiKGLRTLCIAYRKFTSKEYEE IDKRI FEARTALQQR\E 
EKLAAVPQF I EKDLILLGATAVEDRLQDKVRETI EALRMAG I KV 
WVLTGDKHETAVSVSLSCGHFHRTMNILELINQKSDSECAEQLR 
QLARR I TEDHVIQHGL WDGTS LSLALREHEKLFMEVCRNCSAV 
LCCRMAPLQKAKVIRLIKISPEKPITLAVGDGANDVSMIQEAHV 
GIG IMGKEGRQAARNSDYAI ARFKFLS KLLFVHGHFYYIRIATL 
VQYFFYKNVCFITPQFLYQFYCLFSQQTLYDSVYLTLY\NICFT 
SL P I L I YSLLEQHVDPHVLQNKPTLYRD I S KNRLLS I KTFL YWT 
ILG FS HAF I F FFG S YLL IG KDTS LLGNGQM FGNWTFGTLVFTVM 
VITVTVKMALETHFWTW INHLVTWGS I IFYFVFSLFYGGILWPF 
LGSQWMYFVFIQLLSSGSAWFAI ILMWTCLFLDI I KKVFDRHL 
HPTSTEKAQLTETOAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCS PTH I SRSWS ASDP FYTNDRS I LTLSTMDS STC 


6008 


4554 


1089 


AGVRRAGARRGPGRALPAGATAVPPPSARRRRRCPAPEHAGPAR 

ASRP S QE TMFQLP VNNLGS LRKARKTVKKILSD I GLE YCKEHI E 

DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 

FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFNADKKTLETH 

IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 

KCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSLNGAVPLG 

SNAREESSIHCKRCLFMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 

SQQM VNRLS I PKPNLNSTGVNMMSS VHLQQNNYGVKSVGQGYS V 

GQSMRLGLGGNAPVS I PQQSQSVKQLLPSGNGRSYGLGSEQRSQ 

APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 

ATGPPPGNTSSTQKWK ICT I CNELFPENVYS VHFEKEHKAE KVP 

AVANY I M K I HNFTS KCL YCNR YLPTDTLLNHML I HGLS CP YCRS 

TFNDVEKMAAHMRMVHIDEEMGPKTDSTIiSFDLTIiQQGSHTNIH 

LLVTT YNLRDAPAE S VAYHAQNNPPVPPKPQPKVQEKADI P VKS 

SPQAAVPYKKDVGKTLCPLCFSILKGPISDALAHHLRERHQVIQ 

TVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKTQN 

GQDKTNAPSRLNQS PS LAP VKRT YEQME F PLLKKR KLDDDSDS P 

SFFEEKPEEPWIiALDPKGH\EDDSYEARKSFLTKYFT\KQPYP 

TRRE I E KLAAS LWV \ WK\ SD I AS H FSNKRKKCVRDCE KYKPGVL 

LGFNMKELNKVKHEMDFDAEGLFENHDEKDSRVNASKTADKKLN 

LGKEDDSSSDS FENLEESSNESGS PFDPVFEVEPKISNDNPEEH 

VLKVI PEDASES EEKLDQKEDGS KYET I HLTEEPTKLMHNASDS 

EVDQDDWEWKDGAS PSESGPGS QQVSDFEDNTCEMKPGTWSDE 

SSQSEDARSSKPAAKKKATMQGDREQLKWKNSSYGKVEGFWSKD 

QSQWKNASENDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEP 

MHGS LAGVKLS SQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLS LQG * E PH * AA*QAVRSEEKS I C*GS PSC 
HLVLGVLVPVARQSSHSAGPAQSAFR*TGTGSGTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQASQRRTVFTAGGGECLGAKSVRASVFTGNQPGVMGLL 
NGKRGGCFESGYLPGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCR I H I VDAVC * S EHH * DH FLAAAF LENS TI IS* VAPG S WQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
FVLAPQDGEGVPFVEGQLVTVLGL WPQS IRHTFVHHTQLFLHP 
I * KLGALDVAFLHLLTLVCS S FNVAYG* GKNGGTTLHQLFAEVN 
AVTRGS AVQRRPS ITI SS IHVDTKIQQELHDVMVAGADGWQ WG 
DPF WGLAG I FHL I DDPLHQ I ELS FQRRV* EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHVW I VLCRLGS L VGGLGTDE LL WFGGR * L 1 1 1 G 
I * * RGRLSGEW3CGLGRGELFQVS IGIGVS IVHIGQGDHEVLGG 
AGLVERGALHATGQGVEALVQQLLDVGPAGALGLCDGAALFQGP 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ* PVKKRQRRWRG*TR 
R*NGLTIHCFN* LI *GAVCCRLVI LRWCGLLEVHGVYGT* IHCL 
GSFPGRLWP*PFISQERPNGHCQWEFRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL * GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LPPFQGACRPRTQRCRTWVCPIAWRQLLAYTRD 



6010 



3533 



6011 



446 



6012 



351 



IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AG I SQNAKTGDLPAFGECVGI AS KALCGLTEAAAQAAYLVGI FD 
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 
T I VAKHTS AL CNACR I AS S KTAN P VAKRH FVQ S AKEVANS T ANL 
VKTI KALDGDFS EDNRNKCRI ATAPLIEAVENLTAFASNPEFVS 
IPAQI SSEGSQAQEPILVSAKPMLES SSYLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
IRDI EQASIiAAVSQSLATRDDISVEALQEQLTS WQEIGHL I DP 
IATAARGEAAQLGHKGTQLASYFEPLIIiAAVGVASKILDHQQQM 
TVLDQTKTIAESALQMIjYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD I M VTLNE AASEVGLVGGMVDA IAEAMS KLDEGTP PE P KG 
TFVDYQTTWKYS KAIAVTAQEMMTKS VTNPE E LGGLAS QMTSD 
YGHLAFQGQMAAATAEPEEIGFQ IRTRVQDLGHGCIFLVQKAG\ 
ALQVCPTDSYTKRELIECARAVTBKVSIiVLSALQAGNKGTQACI 
TAATAVSGI I ADLDTTIMFATAGTLNAENSETFADHRENILKTA 
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQLAEVVKLGAA 
SLGSDDPETQWIiINAIKDVAKALSDLISATKGAASKFVDDPSM 
YOLKGAAKVMVTNVTS LLKTVKAVEDEATRGTRALEAT I EC I KQ 
ELTVFQS KDVPEKTSSPEES IRMTKG I TMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLV I LQKP T PELKQQLAAFS KRVAGAVTEL I QAA 
EAMKGTEWVDPEDPTVIAETELLGAAAS I EAAAKKLEQLKP RAK 
P KQADETLDFEEQ ILEAAKS I AAATSALVKSAS AAQRELVAQGK 
VGS I PANAADDGQWS QGLI S AARMVAAATSSLCE AANASVQGHA 
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDD VWKTKFVGG I AQ 1 1 AAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 



1835 



LLQPAMRKS PGLS DCLWAWI LLLSTLTGRS YGQ PSLQDELKDNT 
TVFTRILDRLLDGYDNRLRPGLGERVTEVKTDIFVTSFGPVSDH 
DMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTF 
FHNG KKS VAHNMTMPNKLLR I TEDGTLLYTMRLTVR \ AEC PMAF 
G RDF PM \D \ AHAC PLKFGS YAYTRAE WYE WTRE PARS VWAED 
GSRLNQ YDLLGQT VDSGI VQS S TGEYWMTTHFHL KRKIG YF V I 
QTYLPC IMTVILSQVS FWLNRES VPARTVFGVTTVLTMTTLSI S 
ARNSLP KVAYATAMDW FI AVC YAF VFS AL I EFATVN YFTKRG YA 
WDG KS WP E KPKKVKD PLI KKNNTYAPTATS YT PNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
GIFNLVYWATYLNREPQLKAPTPHQ 



5013 



PAELFQSFAIWHKELYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EGIQVRE IAC IQKDKD I PAED I I CEYFEPKPLLEQACL I PCQQD 
C I VS E FS AWS ECS KTCG SGLQHRTRHWAP PQFGGSGCPNLTE F 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
REKDRS KGVKDPEARELI KKKRNRNRQNRQENKYWDI QIGYQTR 
EVMCINKTGKAADLS FCQQEKLPMTFQS CVI TKECQVS EWSE WS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
QGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTALCGGG 
I QTRE V YC VQANENLLS QLSTHKNKE AS KPMDLKLCTG P I PNTT 
QLCHI PCPTECEVS PWSAWGPCTYENCNDQQGKKGFKLRKRRIT 

NE P TGGS GVTGNCPHLLEAI P GEE PAC YDW KAVRLGDCE PDNG K 
EGG PGTQVQE WCINS DGEEVDRQLCRDAI FP I PVACDAPCP KD 
CTLSTWSTWSSCSHTCSGKTTEGKQIRARS I LAYAGEEGGIRCP 
NSSALQEVRSCNEHPCTVYHWQTGPWGQCIEDTSVSSFNTrTTW 
NGEASCSVGMQTRKVICVRWVGQVGPKKCPESLRPETVRPCLL 
PCKKDCIVTPYSDWTSCPS\SCKEGDSS IRKQSRHRVI IQLPAff" 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\IjVP\WS 

VQQDSP\GAQEGCGPGRQARAITCRKQDGGQAGIHECLQYAGPV 

PALTQACQIPCQDDCQLTSWSKFSSCNGDCx^VRTRKRTLVGKS 

KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 

VLLGMKVQGDIKECGQGYRYQAMACYDQNGRLVETSRCNSHGYI 

EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 

GGRPCPKLDHVNQAQVYEVVPCHSDCNQYLWVTEPWSICKVTFV 

NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 

VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 

GRSCPNAVEKEPCNLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI 

KTRMLDCVRSDGKSVDLKYCEALGLEKNWQMNTSCMVECPVNCQ 

LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLMDQS 

KPCPVKPCYRWQYGQWS P CQVQEAQCGEGTRTRNIS CWSDGSA 

DDFS KWDEEFCADI ELI I DGNKNMVLEE S CSQPCPGDCYLKDW 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQML 

ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 

SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 

CTrjIPWVLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 

PFGPDGRLKTWVYGVAAGAFVLLI FI VSMI YLACKKPKKPQRRQ \ 

NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
S I VDVE FL P VYHPS PEE S RDPTL YANNVQR VMAQALG I PAT ECE 
FVGSLPVIVVGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
ARPESNDQPGRVCQAATAL 


6014 


2857 


613 


EAVAGGMEKSRMNLPKGPDTLCFDKDSFMKEDFDVDHFVSDCKK 
RVQLEE1^DDLELYYKLLKTAMVELINKDYADF\VNLSTNLVGM 
DKALNQLS VP LGQLREE VLS LRS S VS EG I RAVDE RMS KQE D I RK 
KKMCVLRLIQVIRSVEKIEKILNSQSSiCETSALEASSPLLTGQI 
LERIATEFNQLQFHACQSK\GMPLLDXVRPRIAGITAMLQQSLE 
GLLLEGLQTS DVD I IRHCLRTYAT I DXTRDAEALVGQVLVKP Y I 
DE V 1 1 EQFVE SHPNGLQ VM YN KLLE F VPHHCRLLREVTGGA I S S 
EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 
TISMDFVRRLERQOSSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RFREIAGSLEAALTDVLEDAPAESPYCLLASHRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 
IKKPLVTGSKE PS ITQGNTEDQGSGPSETKP WS ISRTQLVYW 
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAATtBDSQSSFS 
ACVPSLSSKI IQDLSDSCFGFLKSALEVPRLYRRTNKEVPTTAS 
SYVDSALKPLFQLQSGHKDKLKQAIIQQWLEGTLSESTHKYYET 
VSDVLNSVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 
QLALDVEYLGEQIQKLGLQASDIKSFSALAELVAAAKDQATAEQ 
P 


6015 



13 


2237 


AEGCAERRGTBPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
RITVWRSRSGNELPLAVAS TADL I RCKLLDVTGGLGTDELRLLY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HECKMPHINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAIKAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLS EL PALG I SG I R PT Y ILRW T VE L I VANTKTGRNARR F 
S AGQWEARRGWRLFNCSAS LDWPRMVES CLGS PCWAS PQLLR 1 1 
F\ KAMGQGLQDE \EQEKLLRI CSI YTQSG ENSLVQEGSEAS P I G 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQKEQGSVNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDrHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\ S CG VGS \ GNCSNS S S SNFRGA FLLEARGS LH \ GL\ KTG LQLF 


6016 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
R I TVWRS RSGNELPLAVASTADLIRCKLLDVTGGLGTDELRXJjY 

GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HKKMPHINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 

KAIKAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQIiAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERML S EL PALG I S G I RPT Y I LRWTVE L I VANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDW PRMVES CLGS PCWAS PQLLRI I 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
E KEVLPDQVE EEEENDDQEEEE EDEDDEDDE EEDRMEVGPFS TG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6017 


203 


3469 


SHQE IEQNSAMAPRKRGGRGI S FI FCCFRNWDHPE ITYRLRNDS 
NFALQTMEPALPMPPVEELDVMFSELVDELDLTDKHREAMFALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL 
EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK 
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSESINVIAQ 
S LS TENIKTKVAVLE I LGAVCLVPGGHKKVLQAMLHYQ KYASE R 
TRFQTLINDLDKSTGRYRDE VSLKTAI MS F INAVLSQGAGVESL 
DFRLHLRYE\FLMLGIHPVMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHI DTKSATQMFELTRKRLTHSEAYPHFMS ILH 

HCLQMPYKRSGNTVQYWLLLDRIIQQIVIQNDKGQDPDSTPLEN 

FNIK3WVRMLVNENEVKQWKEQAEKMRKEHNELQQKLEKKEREC 

DAKTQEKEEMMQTLNKMKEKLEKETTEHKQVKQQVADLTAQLHE 

LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 

LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 

PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFKILDLEDLERTF 

SAYQRQQDFFVNSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 

QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 

KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 

KFAERVAEVKPKVEAIRSGSEEVFRSGALKQLLEWLAFGNYMN 

KGQRGNAYGFKISSLNKIADTKSSIDKNITLLHYLITIVENKYP 

svlnij^elrdipqaakv™teldkeistlrsglka\^ 

KSQPPQPGDKFVSWSQFITVASFSFSDVEDLLAEAKDLFTKAV 

KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQENENMRKKKEEEE 

RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 

FDKDLSKLKRNRKRITNQMTDSSRERPITKLNF 


6018 


13 


2510 


TISQSGGIRRRREAVWFEWWMDFSRLHMYSPPQCVPENTGYTY 
ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 
GEAVGADSGTSSAVS LKNRAARTTKQRRS TNKSAFS INHVSRQV 
TS SGVS YGGTVSLQDAVTRRP P VLDE S W I REQTTVDHFWGLDDD 
GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLSERKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAGYFLLQILRRIGAVGQAVSRTAWSALWLAWAPGKAASGVF 
WWLGIGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 
LSLRGQG\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 
PLQGDSEAFPWHWMSGVEQQVASLSGQCHHHGENLRELTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HLEDILGKLREKSEAIQKELEQTKQKTISAVGEQLLPTVEHLQL 
ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMVKLLFSED 
QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQLPTSEAWSAVS EAGASG I TEAQARAI VNS ALKLYSQDKTG 
MVDFALESGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 
WIQPDIYPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTL 
SPTGNISSAPKDFAVYGLENEYQEEGQLLGQFTYDQDGESLQMF 
QALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK 


6019 


2 


1066 


TPNDREPPPQRPPSSRRASHLAQEITSAASLGDQTQILGSLTTA 
PVITSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA 
QSLQVQAVTPQLLLNAQGQVIATLASSPLPPPVAVRK\PSTPES 
LLKSEVQPIKPTPTVPQPAWIASPAPAAKPSASAPIPITCSET 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFE KLDI TPKSAQKLKPVLEKWLN 
EAELRNQEGQQNLMEFVGGEPSKKRKRRTSFTPQAIEALNAYFE 
KNPLPTGQEITEIAKELNYDREWRWFOmRQTLKNTSKLNVF 
QIP 


6020 



4953 


549 



EAIQFEVS IGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI 1 1 WMIRGE KRLAYAR I PAHQVLYS TSGENASG KYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFS DVTGKI KLKRE F 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGI TI PPDHKPKS WVAAEKMYHTHRRRRLVR KRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPS ETHGAAAI FKLEGALGADTTEDGDEKSLE KQKHSA 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP 
YAH I CFIiHRS KTTE I IHS TLNPTWDQTI I FDEVE I YGE PQTVLQ 

NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
H P VMNGD KACGDVL VTAE LI LRG KDGSNLP I LP PQRAPNL YMVP 
QG IRP WQLTAI E ILAWGLRNMKNFQMAS ITS PS LWBCGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKP WGQCT I ERLDRFRCDP YAGKEDIVPQLKASIjLS AP PGR 
D I VI EMEDTKPLLAS KCLS S MSTALS KMAS PAT VHLTE KEEE I V 
DWWSKFYASSGEHEKCX5QYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE \DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKIS VYDYD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKIIjHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS \RY YLRVI I WNTKDVI LDEKS ITGEEMSDI YVKGWI PGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/RGLDMI PDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAE KDGAR VMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 



EAIQFEVS IGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPWTLTSYWEDISHRLDAVNTLIiAMAERLQTNIEALKSGI 
QGKI PANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPD II IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITI PPDHKPKS WVAAEKMYHTHRRRRLVR KRKKD 
LTQTASSTAGAMBELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPS ETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
Ti vraANTPIvSCNFDRDYIYHLRCYVYQARNLIiALDKDSFSDP 
YAHICFLHRSKTTEI IHSTLNPTWDQTI I FDEVE I YGE PQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
HP VMNGDKACGDVLVTAE LI LRGKDGSNLP I LP PQRAPNL YMVP 
QG IRP WQLTAIE I LAWGLRNMKNFQMAS ITS PSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DI VI EMBDTKPLLAS KCLSSMSTALSKMAS PATVHLTEKEEE I V 
DWWS KFYAS SGEHEKCGQ Y IQKGYSKLKI YNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\ IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 

FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6022


4953


549


EAI Q FEVS I GNYGN KFDTTCKPLAS TTQ Y S RAVFDGNY YY YLPW 
AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAE IEDWLDKLMQLTE 
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDE KG WE YG IT I PPDHKPKS WVAAEKMYHTHRRRRLVRKRKKD 
LTQTAS S TAGAMEE LQDQEGWEYASL I G WKFHW KQRS SDT FRRR 
RWRR KMAPSE THGAAAI FKLEGALGADTTEDGDEKS LEKQKHSA 
TT VFGANTP I VS CNFDRD Y I YHLRC YVYQARNLLALDKDS FSD P 
YAHI CFLHRSKTTEIIHSTLNPTWDQTI IFDEVE I YGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLLW 
H P VMNGDKACGDVL VTAE L I LRGKDGS NLP I LP PQRAPNLYMVP 
QG IRP WQLTAI E I IAWGLRNMKNFQMAS ITSPSLVVECGGERV 
ESWI KNLKKT PNF PSSVLFMKVFLP KEEL YMP PLV I KVIDHRQ 
FGRKP WGQCT I ERLDRFRCDP YAGKED I VPQL KAS LLS AP P CR 
D IVI EMEDTKPLLASKCLSSMSTALSKMAS PATVHLTEKEEE IV 
DWWS KFYASSGEHEKCGQY I QKG YSKLKI YNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDS VPQECTVR I Y I VRGLELQPQDNNGLCDP YI KITLG 
KKVI E \DRDH Y I PNTLNP VFGRM YELS CYLPQEKDLKI S VYD YD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 

fPTiMVTT.uniJT / , atJT?trDT.BT.IITT.T3TYV^T»'\7DT7inrJ? r T r l? r rT.WQTPOP 
r ili/-y.N is 1. 1 1 n n 1 ^ •'H r Ti r*r K > 1 i » k ■'-V^ 7 Ll v rativa ixiiunoif yr 

N IS\R YYLRVI I WNTKDVILDEKS ITGEEMSDI YVKGWI PGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 

SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 

HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 

P CYAEKDGAR VMAG KVEMTLE I LNE KEADERPAGKGRDE PNMNP 

KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 

LFVAVLLYSLPNYLSMKIVKPNV


6023 


102 


916 


S QELGMFVELNNLLNTTPDPJIEQGKLTLLCDAKTDGS FLVHHFL 
S FYLKANCKVCFVAL IQSFSHYS I VGQKLGVSLTMARERGQLVF 
LEGL/IVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFEFVREA 
LKP VDSGEARWTYPVLLVDDLS VLLSLGMGAVAVLDF IHYCRAT 
VCWELKGNMWLVHDS GDAEDEEND I LLNGLSHQSHL I LRAEGL 
ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF 
AKGMSPAVL 


6024 


3 


3260 


FLS FLCYPRFRCLFCLQFAI PASRMEQLNELELLMEKS FWEEAE 
LPAELFQKKWASFPRTVXjSTGMDNRYLVLAVNTVQNKEGNCEK 
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA
TRQMLIGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM



YRLNLSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSIj 
PSDNSICDNSTCNI EWKPMDI EES I WS PRFGLKGKIDVTVGVKI 
HRGYKTKYKI MPLELKTGKESNS I EHRS QWLYTLLSQERRADP 
EAGLLLYLKTGQMYPVPANHLDKRELLKLRNQMAFSLFHRISKS 
ATRQKTQLASLPQIIEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLTLESQSKDNKKN 
HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH 
GAI P VTNLMAGDRVI VSGEERS LFALSRG YVKE INMTTVTCLLD 
RNLS VLPES TLFRLDQEE KNCD I DTP EjGNLS KIiMENT FVS KKLR 
DLIIDFREPQFISYLSSVLPHDAKDTVACILKGLNKPQRQAMKK 
VLL SKD YTL I VGM PGTG KTTT I CTLVR I L YACG FS VLLTS YTHS 
AVDNILLKLAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 
KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 
ISQPICLGPLFFSRRFVLVGDHQQLPPLVLNREARALGMSESLF 
KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSNVTEAKLI VFLTS I FVKAGCSPS DIG I IAP 
YRQQLKI INDLLARS IGMVE VNTVDKYQD\RDKS I VLVS FVRSN 
KDGTVGELLKDWRRLNVAITRAKHKLIIXGCVPSLNCYPPLEKL 
LNHLNS EKL 1 1 DLPSREHES LCH I LGDFQRE 


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 

ARYGEAGEGPGWGGAHPRICIiQPPPTSRTSFPPPRLPALEQGPG 

GLWWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIP 

AACGATLPALGLRSSAQDPQAVLGALGRALS pleewlrlht yla 

GEAPTLADLAAVTALLLPFRYVLDPPARRIWNNVTRWFVTGVRQ 

PEFRAVLGEWLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKR 

EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 

PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS 

AANPRGVFMMC I P P PNVTGS LHLGHALTNA I QDS LTRWHRMRGE 

TTLWNPGCDHAGIArQVVVEKKLWREQGIiSRHQLGREAFLQEVW 

KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 

LHEEGI I YRSTRLVNWS CTLNSAI SD I E VDKKELTGRTLLS VPG 

YKEKVEFGVLVS FAYKVQGS DSDEE VWATTRI ETMLGDVAVAV 

H PKDTR Y QHL KGKNVIH PFLSRS LP I VFD E FVDMDFGTGAVK I T 

PAHDQND YE VGQRHGLEAI S I MDSRGAL INTVPPPFLGLPRFEAR 

KAVLVALKERGLFRG I EDNPMWPLCNRS KDWEPLLRPQWYVR 

CGEMAQAASAAVTRGDLRILPERHQRTWHAWMDNIRE\WCMFPG. 

KLWWG\ HR\ I PAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 

KAAKEFGVSPDKISLQQDEDVLDTWFSSGLFPLSILGWPNQSED 

LSVFYPGTLLETGHDILFFWVARMVMLGLKLTGRLPFREVYLHA 

IVRDAHGRKMSKSLGNVIDPLDVIYGISLQGLHNQLLNSNLDPS 

EVEKAKEGQKADFPAGIPECGTDALRFGIjCAYMSQGRDINLDVN 

R I LGYRHFCNKLWNATKFALRGLGKGFVPS PTSQPGGHESLVDR 

WIRSRL TEAVRLSNQG FQAYDFPAVTTAQ YSFWL YELCDVYLEC 

LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 

RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELALSITRA 

VRP\LRADYNLHPESGPTCFIiEVAD\EATGALASAVSGYVQGPG 

QAQVWAVAE PWGLPAP\QGCAVALASDRCS I \HLQLQG\LLDP 

ARELG\KLQ\AKRVEAQ\RQAQ\RLR\ERRA\ASGNPVKVPL\E 

VQEADEAKLQQTEAELRKVDEAIALFQKML 


6026 


2674


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC 
TCE I RPWFTPRS I YMEASTVDCNDLGLLTFPARItPANTQILLLQ 
TNNIAKIEYSTDFPVNLTGLDLSQNNIjSSVTNINGKKMPQLLSV 

YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN
LLRLHLNSNRLQMINSKWFDALPNLEILMIGENPIIRIKDMNFK
PLINLRSLVIAGINLTEIPDNALVGLENLESISFYDNRLIKVPH
VALQKVVNLKFLDLNKNPINRIRRGDFSNMLHLKELGINNMPEL
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLML
NSNALSALYHGTIESLPNLKEISIHSNPIRCDCVIRWMNMNKTN
IRFMEPDSLFCVDPPEFQGQNVRQVHFRDMMEICLPLIAPESFP
SNLNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKLLPNT\I*TD
KFYVHSEGTLDINGVTPKEGGLYTCIATNLVGADLKSVMIKVDG 
SFPQDNNGSLNIKIRDIQANSVLVSWKASSKILKSSVKWTAFVK 
TENSHAAQSARIPSDVKVYKLTHLNPSTEYKICIDIPTIYQKNR 
KKCVNVTTKGLHPDQKEYEKNNTTTLMACLGGLLGI IGVICLIS 
CLSPEMNCDGGHSYVRNYLQKPTPALGELYPPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254 


4148 


GGRRAPGRPGRS I KDEEEETVFREWS FS PDPLPVRYYDKDTTK 
P I SFYLS SLEELLAWKPRLEDGFNVALEPLACRQPPLS SQRPRT 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TI PFVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAFIiAGDERSY 
QAVADRLVQIT\RFFRFDGWLINIENSLSLAAVGNMPPFLRYLT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTNYNWREEHLERMLGQAGERRADVYVGVDVFARGNWGGRFDT 
DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVALRNRCPAPAKLCPH 


6028 


120 


3432


NCLIiLQAKGFHGEIEDLQQWLTDTERHLLASKPLGGLPETAKEQ 
LNVHMEVCAAFEAKEETYKSLMQKGQQMLARCPKSAETNIDQDI 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 
ELDKTGTHLKYFSQKQDWLIKNLLISVQSRWEKWQRLVERGR 
SLDDARKRAKQFHEAWS KLMEWLEESEKSLDSELE IANDPDKI K 
TQLAQHKE FQ KS LG AKHS VYDTTNRTGR S L KE KTS LADDNLKLD 
DMLSELRDKWDTICGKSVERQNKLEEA\LIiFSGQFTDALQALID 
WLYRVEPQLAEDQPVHGDI DLVMNLIDNHKAFQKELGKRTSSVQ 
ALKRSARE LI EGSRDDS S WVKVQMQELSTRWETVCALS IS KQTR 
LEAALRQAEE FHS WHALLEWLAEAEQTLRFHGVLPDDEDALRT 
L I DQHKEFMKKLEEKRAELNKATTMGDTVLAI CHPDS ITTI KHW 
ITI IRARFEEVIAWAKQHQQRLASAiAGLIAKQEIiLEALLAWLQ 
WAETTLTDKDKEVIPQE IEEVKALIAEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHI PVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWliLALERRRKLNDALDRLEELREFA 
NFDFD I WRKKYMRWMNHKKS RVMDFFRRIDKDQDGKI TRQE FID 
G I LSSKFPTSRLEMSAVAD I FDRDGDGYIDYYE FVAALHPNKDA 
YKP I TDADKI EDE VTRQVAKCKC AKRFQVEQ I GDNKYRFFLGNQ 
FGDSQQLRLVR I LRSTVMVRVGGG WMALDEFLVKNDP CRAKGRT 
' NMELREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAAS PQVPATTTP KI LHPLTRNYGBCP WLTNS KMSTPCKAA 
ECSDFPVPSAEGTPIQGSKIiRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 
IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AGIS QNAKTGDL PAFGECVG I AS KAL CGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP IQFARANQAIQMACQNLVDPGS S PSQVIiSAA 
T I VAKHTS ALCNACRI AS S KTANPVAKRH FVQS AKE VANSTANIi 
VKT I KALDGDFS EDNRNKCR I ATAPL I EAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSIAINPKDP 
PTWSVLAGHSHTVSDS IKSLITS IRDKAPGQRECDYS IDGINRC 
IRDIEQASLAAVSQSLATRDD ISVEALQEQLTSWQE IGHLIDP 
IATAARG EAAQLGHKGTQLAS YFEPL I LAAVG VAS K I LDHQQQM 
TVLDQTKTIAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDDIMVTLNEAASEVGLVGGMVDAIAEAMSKLDEGTPPEPKG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGLASQMTSD 
YGHLAFQGQMAAATAE PEE I G FQ I RTRVQDLGHGC I FLVQKAG \ 
ALQ VC PTDS YTKRE LI E CARAVTE ECv SLVLS ALQAGKK.GTQAC I 
TAATAVS G 1 1 ADLDTT I MFATAGTLNAENS ETFADHREN I LKTA 
KALVEDTKLLVSGAAS TPDKLAQAAQSSAATITQLAE WKLGAA 
S LGS DDPETQWL INAI KDVAKALSDLI S ATKG/^ASKP VDDPSM 
YQLKGAAKWVTNVTSLLKTVKAVEDEATRGTRTUjEATIECIKQ 
ELTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGNSCRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPEDPTVIAETELLGAAAS IEAAAKKLEQLKPRAK 
PKQADETLDFEEQILEAAKSIAAATSALVKSASAAQRELVAQGK 
VGS I PANAADDGQWSQGLI SAARMVAAATSSLCEAANAS VQGHA 
S EEKLIS SAKQVAAS TAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQI IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLEVLICIX3LMGLERALNVLAPIFYRNIVNLLTEN 
APWNSLAWTVTSYVFLKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVELLI FSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
S YL VFNVT P TLADI 1 1 G 1 1 YFSM F FNAW FGLI VFLCMS L YLTLT 
IVVTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAI I KYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AY FVTEQ KLQVGDYVL FGTYI IQLYMPLNWFGTYYRMIQTNFID 
MENMFDLLKK\ETEVKDLPGAGPFRFQKGRIEFENVHFSYADGR 
ETLQDVSFTVMPGQTIiALVGPSGAGKSTILRLLFRFYDISSGCI 
R IDGQDI SQVTQALFRFSHWELC PKDTVLFNDT I ADNI RYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEKQR 
VAIARTILKAPGIILLDEATSALDTSNERAIOASLAKVCANRTT 
IWAHRLSTVVNADQILVIKDGCIVERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRM S ENLD KS NVNEAGKS KSNDS E EGLEDAVEG ADEALQ KA I KS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTNMALAHEIVVNG 
DFQIKPVELPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
IKLVGBIKETLLSFLLPGHTRLRNQITEVLDLDLIKQEAENGAL 
DISKIiAEFIIGMMGTLCAPARDEEVKKLKDIKErVPLFREIFSV 
LDLMKVDMANFAISSIRPHLMQQSVEYERKKFQEILERQPNSLD 
FVTQWLEEASEDLMTQKYKHALP VGGMAAGSGDMPRLS PVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQS RFHELQLQ\ REQLTI LGAV 
LLVTFSMAAPGISSQADFAEKLKMIVKILLTDMHLPSFHLKDVL 
TTIGEKVCLEVSSCLSLCX3SSPFTTDKETVLKGQIQAVASPDDP 
IRRIMESRILTFLETYLASGHQKPLPTVPGGLSPVQRELEEVAI 
KFARLVNYNKMVFC PYYDAI LS K I LVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS
YGLNIEMHKQAEIVKRLNAICAQVIPFLSQEHQQQVVQAVERAK
QVTMAELNAI IGQQQLQAQHLSHGHGLP VPLTPHPSGLQPPAI P 
PIGS SAGLLALSSALGGQSHLP I KDEKKHHDNDHQRDRDS I KSS 
SVSPSAS FRGAEKHRNSADYSSESKKQKTEEKE IAARYDSDGEK 
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP
AS IAS SS STPS SKS KELSLNEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRS PWGFD 
PHHHMRVPAI PPNLTGIPGGKPAYSFHVSADGQMQPVP FPPDAL 
IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL 
S I WDLAAPTPRI KAELTSSAPACYALAIS PDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFQGHTDGAS C I DISNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 

W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS
YGLNIEMHKQAEIVKRLNAICAQVIPFLSQEHQQQVVQAVERAK
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P IGSSAGLLALSSALGGQSHLP I KDEKKHHDNDHQRDRDS IKS S 
S VS PSAS FRGAEKHRNSADYSS ES KKQKTEEKE IAARYDSDGEK 
SDDNLWDVSNEDPS S PRGS PAHS PRENGLD KTRLLKKDAP ISP 
AS IASSS STPS SKS KELSLNEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRSPWGFD
PHHHMRVPAIPPNLTGIPGGKPAYSFHVSADGQMQPVPFPPDAL
IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


ESGRRRRLKRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM 
E I APQEAP PVPGADGD I E EAPAEAGS PS PAS PPADGRLKAAAKR 
VTFPSDED I VSGAVE PKDPWRHAQNVTVDE VIGAYKQACQKLNC 
RQIPKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEQTNLDEDGASALFDMI EY YESATHLNI S FNKHIGT 
RGWQAAAHMMRKTSCLQYIi\DARNrPLLDHSAPFVARALRIRSS 
LAVLHLENASLSGRPIjMLLATALKMNMNLRELYL\ADNKLNGLQ 
D S AOLGNLLKFNCSLQ I LDLRNNHVLD SGLAY I CEG LKEQRKGL 
VTL\ VLWNNQLTHTGMAFLGMTLPHTQS LETLNLGHNP I GNEGV 
RHLKNGLISNRSVLRLGIiASTKLTCEGAVAVAEFIAESPRLLRL 
DLRENEI KTGGLMALS LALKVNHS LLRLDLDRE PKKEAVKSF I E 
TQKALLAEIQNGCKRNLVIiAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLIiEASQESGQETL 


6035 


13 


404 


S VTYLG 1 1 LHKNTGALPADPVQL I SQTPTPSTKQQLLS FLGMVG 
YFYLWIPGFAILTKPLCKLTKENLADAIDPKSFSHSSFRSLKTA 
LENASTLALPDSSQPF\SLHTAEVQGCWEILTQGLGPLPV 


6036 


1745 


356 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGQGRGVEKPPHLAALILARGGSKGIPLKNIKHLAGVPLIGW 
VLRAALDSGAFQSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
SSTSLDAI I EFLNYHNEVDI VGN IQATS PCIiHPTDLQKVAEMIR 
EEGYDSVFSWRRHQFRWSEIQKGVREVTEPLNLNPAKRPRRQD 
WDGELYENGS FYFAKRHL I EMGYLQGGKMAY YEMRAEHSVD IDV 
DI DWP IAEQRVLRYGYFGKEKLKE I KLLVCNI DGCLTNGH I YVS 
GDQKE I IS YDVKDAIGI SLLKKSG I EVRL I SERACSKQTLS SLK 
LDCKMEVSVSDKLAWDEWRKEMGLCWKEVAYLGNEVSDEECLK 
RVGLS GAPADACS TAQKAVGY I CKCNGGRGA\ I R3 FAEHI C\LL 
ME KGL INFMPKNRNLAVN I GEKK 


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA 
GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\EL CRP FEENGACKYG 
DKCQFAHG IHELRSLTRHP KYKTELCRT FHT IGFCP YGPRCHFI 
HNAEE RRALAGARDLS ADR P RLQH S FS FAGF PS AAATAAATGLL 
DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQELASLFA 
PSMGLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSIiSDQEGYLS 
SSSSSHSGSDSPTLDNSRRLPIFSRLSISDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQIISCNICQIiRFNSDSQAAAH 
YKGTKKAKKLKALEAMKNKQKSVTAKDSAKTTFTS ITTNTINTS 
SDKTDGTAGTPAISTTTTVEIRKSSVMTTEITSKVEKSPTTATG 
NSSCPSTETEEEKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGT I KAFPRAG VKG KGPVNKGNTGLQNKT FHCE I CD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSSPFSLRTAP 

AATLFQTSALPPALLRPAPGP I RTAHTPVL FAPY 


6039 


4073 


1000 


LDE YEARLTLANLDD FEEDNEDDDENRVNQEEKAAK I TELINKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDS EE P I TETAS P 
RKTEDSFYNNSYNPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST 
KKKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VNPVQELETERRVKRKAPAPPVLSPKTGVIjNENTVSAGKDLSTS 
PKPS PI PS PVLGRKPNASQS LLVWCKEVTKNYRGVK ITNFTTS W 

RNGLSFCAI LHHFRPDL I DYKS LNPQD I KENNKKAYDGFAS I G I 

SRLLEPSDMYLLAIPDKLTVMTYLYQIRAHFSGQELNWQIEEN 

SSKSTYKVGNYETDTNSSVDQEKFYAELSDLKREPELQQPISGA 

VDFLSQDDS VFVNDSGVGESE S EHQTPDDHLSPSTAS P YCRRTK 

SDTE PQKSQQSSGRTSGSDDPG I CSNTDSTQAQVLLGKKRLLKA 

ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 

LENSRSLECRSDPES P I KKTS LS PTS KLGYS YSRDLDLAKKKHA 

SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 

ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 

DTNEEI PEGFWGGGDELTNIiENDLDTPEQNSKLVDLKLKKLLE 

VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 

RNPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 

EEKAAITETQRKPSEDEVLNKGFKDS\SQYWGELAAIiENEQKQ 

IDTRAALVEKRLRYTjMDTGRNTEEEEAMMQEWFMLVNKKNALIR 

RMNQLSLLEKEHDLERRYELLNRELRAMLAIEDWQKTEAQKRRE 

QLLI^ELVALVWKRDALVRDLDAQEKQAEEEDEHLERTLEQ2«CG 
KMAKKEE KCVLQ 


6040 


475 


1052 


PTALMTAPSCAFPVQFRQPSVSGLSQITKSLYISNGVAANNKLM
LSSNQITMVINVSVEWNTLYEDIQYMQVPVADSPNSRLCDFFD
PIADHIHSVEMKQGRXTLLHCAAGVSRSAALCLAYLMKYHAMSL
IIDAHTWTKSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV
GMIPDIYEKEVRLMIPL


6041


2


3886


TEKDEKTAHNLENVLIHFWERLSE I CVAKISEPEADVES VLGVS 

NLLQVLQKPKGSLKSSKKKNGKVRFADEILESNKENEKCVSSEG 

EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 

VNERKSEQHLRFLSTLLDSFSSSRVFKMLLGDEKQSIVQAKPLE 

I AKLVQKNPAVQFLYQKL IGWLNEDQRKDFGFLVD I LYSALRCC 

DNDMERKKVLDDLTKVDLKWNSLLKI IEKACPSSDKHALVTPWL 

KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTLLSLVLSQ 

HVKNDYIiIGDVYVERIIVRLHETIjFKTKICLSEABSSDSSVSFIC 

DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI . 

CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 

DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 

MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTIiPSHLCT 

SALLSKMVL IALRKETVLENNELE KI IAELLYSLQWCEEIiDNPP 

I FL I GFCE I liQKMNI TYDNLRVLGNMSGLLQIjLFNRSREHGTLW ' 

SIiIIAKIiILSRSISSDEVKPHYKRKESFFPLTEGNLHTIQSLCP 

FLS KEEKKEFSAQCI PALLGWTKKDLCSTNGGFGHLAI FNSCLQ 

TKSIDDGELLHGILKIIISWKKEHEDIFLFSCNLSEASPEVLGV 

NIEIIRFLSLFLKYCSSPLAESEWDFIMCSMLAWLETTSENQAL 

YSIPLVQLFACVSCDLACDLSAFFDSTTLDTIGNLPVNLISEWK 

EFFSQGIHS LLLPILVTVTGENKDVS ETS FQNAMLKPMCETLTY 

ISKEQLLSHKLPARLVADQKTNLPEYLQTLLNTLAPLLLFRARP 

VQIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 

LLSIQEDLLENVLGCIPVGQIVTIKPIjSEDFCYVIjGYLLTWKLI 

LT FFKAAS S QLRALYSMYLRKTKS LNKLLYHLFRLMPENPT YAE 

TAVEVPNKDPKTFFTEELQLSIRETTMLPYHIPHLACSVYHMTL 

KDLPAMVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQEISSVQT 

STQLFNGMTVKARATTREVMATYTIEDIVIELIIQLPSNYPLGS 

IIVESGKRVGVAVQQWRNWMLQLSTYLTHQNGSIMEGLALWKNN 

VDKRFEGVEDCMICFSVIHGFNYSLPKKACRTCKKKFHSA\CLY 

KWFTSSNKSTCSLCRETFF 


6042 


1306


253 


MAEI*APASPSD I KAS VSNGDTTLLCSRRQSCGMNEVRQVSLTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQRAPGGLSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 
LVDKNIDRFIPITKLKYYFAVDTMYVGRKLGLLFFPYLHQDWEV 
QYQQDTPVAPRFDVNAPDLYI PAMAFI TYVLVAGLALGTQDRFS 
PDLLGLQASSALAWIjTLEVLAI IJjS LYLVTVNTDXjTTI DLVAFL 
G YKYVGM I GG VLMGLLFG KI G Y YLVLGWCCVA I F VFM I RTLRLK 
ILADAAAEGVPVRGARNQLRMYLTMAVAAAQPMLMYWLTFHLVR 


6043 


403 


599 


LCLFFPFPCATPVLPLPSLISAL/CLSHLSVSSWFCPCQPPLPC 
PLPPLQNKTAKGSLSTEQSERG 


6044 


793 


412 


KLEMWNFTLISKVKISREVTMIASKFGIGQQVRHSLMYLGVVV 
DIDPVYSLSEPSPDEIAVNDELRAAPWYHVVMEDDNGLPVHTYL 
AEAQLSSELQDEHP\EQPSMDELAQTIRKQIiQAPRLRN 


6045 


155 


2299 


S PLPQ VAAMNYLRRRLS DSN FMANLPNG YMTDLQRPQP P P P P PG 
AHS PGATPGPGTATAERSSGVAPAASPAAPS PGSSGGGGFFS SL 
SNAVKQTTAAA7\ATFSEQVGGGSGGAGRGGAASRVLLVIDEPHT 
DWAKYFKGKKIHGB I DI KVEQAEFSDLNIiVAHANGGFS VDME VL 
RNGVKWRSLKPDFVL I RQHAFSMARNGDYRSLVIGLQYAGI PS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLS S \TTYPVWKMGHGTLWGWGKVKVDNQHDFQDI ASVVALT 
KT YATAE PF I DAKYDVRVQKI GQNYKAYMRTS VSGNWKTNTGS A 
MLEQ I AMSDRYKLWVDTCSE I FGGLDI CAVEALHG KDGRDHI I E 

WGSSMPLIGDHQDEDKQLIVELWNKMAQALPRQRQRDASPGR 

r'CXirTiT'DQ Dr*AT dt njATcnnDiPDDBn^uODDnftirDDnDnDrD 
laorlvjy L c o Fvjf>\j_jc' JjvjKv * ^ y y irAva Jr fiiWy K. Jr r* Jr yvnj Jr xr v " 

QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
ASQAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


107b 


TT/-*T T'PnPC'BUnDT T /-tT)/^ t5 DUr^Ty TT3 > PUDD 7\\7D fJ 7\ /T1CO T DDT TUB 

hiVaJbiolrCJlKVt'r Ijl/jKljJrFrlULAXKAwiKKAVKW 

SLIMDSPRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPLS PGLPAMGGPGPGP CEDPAGAGGAGAGGSEPLVTVTVQCA 
FTVALRARRGADLSSLRALIiGQALPHQ\AQLGQLSYLAPGEDGH 
WVPIPEEESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAQGPEDl^FRQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047


49


1405


PVLVTSLRMREADTLRPPQLMEVSADI ISTVEFNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDEEGKLKDLS TVTSLQVP VLKPMDLMVEVSPRR I FANGHTYH 

NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRIiCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL\NMEARPIETYQVHDYLRSKLCSLYEXJDCIFDKFECA 
WNGSDSVIMTGA\YNNFFRMFDRNTKRDVTL\EASRESSKPRAV 
LKPRRVCVGGKRRRDDI S VDSLDFTKKI LHTAWHPAEN I IAIAA 
TNNLY I FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTSNSSKTRAGANSKGRRGSQNSSEHRPPASSTSEDVKASPSS 
ANKRKNKP LSDME LNSSSEDS KGSKRVRTNSMGSATGPLPGTKV 
EPTVLDRNCPSPVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 
S KPEADGDSEYGEE P I LHADLGS CNG \ AS VSQK\GS LS PARS AT 
PKVRLVEPHSPSPSSKFSTKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARP I / APIAI P PQQI YTFQTATFTAAS PGSSSGLTATVAQAMP 
NS PQLKPIQPKPTVMGEPFTVNPALTPAXDKKKKDKKKKES SKE 
LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLLNGSSDPHQSR 
LASI KAEADKI YS FTDNAPSPSIGGSSRLENTTPTQPLTPLHW 
TQNGAEASSVKTNS PAYSDISDAGEDGEGKVDSVKS KDAEQLVK 
VGAKTC TTjFPPOPO QlCDSPYYOflPRCtYY*? P55 YAOSS PGALNP S SO 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
IQQRPNMYMQSLYYNQYAYVPPYGYSDQSYHTHLIiSTNTAYRQQ 
YEEQQKRQSLEQQQRGVDKKAEMGLKEREAALKEEWKQKPSIPP 
TLTKAPSLTDLVKSGPGKAKE PGADPAKS VI I PKLDDS S KLPGQ 
APEGLKVKLSDASHLSKEASEAKTGAECGRQAEMDP I LWYRQEA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 
EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQS~ 

IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

S PSQRLMSTHHHHHHLGYSLLPAQYNLP YAAGLSSTAI VASQQG 
STPSLYPPPRR 


6049 


215 


1089


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTLPESSAT 
DSDYYS PTGGAPHGYCS PTSAS YG\KALNP YQYQYHGVNGSAGS 
YPAKAYADYS YASS YHQYGGAYNRVPSATNQPEKEVTE PEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6050 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6051 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6052


566


1718


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6053 


201


1704 


KGTEMNKSRWQSRRRHGRRS HQQNP W FRLRDSEDRSDSRAAQPA 
HDSGHGDDESPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSALASD 
RFNLILADTNSDRLFTVNDVTVGGSKYGI INLQSLKTPTLKVFM 
HENLYFTNRKV\NS VCWASLNHLDSHI LLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRR VLLTNWTGHRQS FGTNS DVLAQQFALMAPLLFNGCRS 
GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKI KLWDLRTTKCVRQYBGHVNEYAYLPLHVHEEEG I LVAVG 
QDCYTR I WSLHDARLLRTI PS P Y PASKAD I PSVAFS SRLGGS RG 
APGLLMAVGQDLYCYSYS 


6054 


1 


1054 


PPIARLQEFGTSRRHMAAPSGVHLLVRRGSHRIFSSPLNHIYLH 
KQSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 
VTTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEE I DALVHRE 1 1 SHNAYPS PLGYGGFP KSVCTS VNNVLCH 
GI PDSRPLQDGDI INIDVTVYYNGYHGDTSETFLVGNVDECGKK 
LVE VARRCRDEAIAACRAGAP FS VI GNT I SH I THQNG FQ VCPHF 
VGHG IGS YFHGHPE I WHHANDSDLPMEEGMAFT I EP I ITEGS PE 
FKVLEDAWTWSLD/TSKVSAQFEHTVLITSRGAQILTKLPHEA 


6055 


421 


2364 


P P YFLLS FLAWWLYGQSDRTETD I S QS AGP P PGTLQ CS ALHHDP 
GCANCSRFCRDCSPPACQCHTHVFPGNALNGVQPPELSRTLALI 
SSREPPRKKKKSQTETGKERERTSFLTQGGKRFELQHGIiAGICM 
TLLITGDS I VSAEAVWDHVTMANRELAFKAGDVIKVLDASNKDW 
WWGQIDDEEGWFPASFVRLWVNHEDEVEEGPSDVQNGHLDPNSD 
CLCLGRPLQNRDQMRANVTNE IMSTERHYIKHLKDICEGYLKQC 
RKRRDMFSDEQLKVI FGNI EDI YRFQMGFVRDLEKQYNNDDPHL 
BE I GP CFLEHQDG FW I YS E YCNNHLDACMELS KLWKDS R YQHFF 
EACRLLQQM ID I A\ I DGFLLTPVQKI CKYPLQLAELLKYTAQDH 
SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 
GED I LDRS S EL I YTGEMAW I YQP \ YGRNQQRVFFLFDHQMVTjCK 
KDL I RRD IL YYKGRIDMDKYEWD I EDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLNHGQYLVP 
\DGI AQSQVFEFTEPKRSQS PFWQNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSIiPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAYME SKGAHRAGLAKVI P P KE WKPRQCYDD IDNLLI PAP I QQ 
MVTGQS GLFTQ YNI QKKAMTVKE FRQLANSGKYCTPR YLD YEDL 
ERKYWKNLTFVAP I YGADINGS I YDEGVDEWNI ARLNTVLDWE 
EECGISI EGVNTP YLYFGMWKTTFAWHTEDMDLYS INYLHFGEP 
KS WYAI P PEHGKRLERLAQGFFPSS SQGCDAFLRHKMTLI S PS V 
LKKYGIPFDKITQEAGEFMITFPYGYHAGFNHGFNCAESTNFAT 
VRWIDYGKVAKLCTCRKDMVKISMDI FVRKFQPDRYQLWKQGKD 
IYTIDHTKPTPASTPEVKAWLQRRRKVRKASRSFQCARSTSKRP 
KADEEEEVSDEVDGAEVPNPDSVTDDLKVSEKSEAAVKLRNTEA 
SSEEE SS ASRMQVEQNLSDH I KLSGNSCLSTSVTEDI KTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
S VAE SNGVLTEGEE S DVE SHGNGLB PGE I PAVP S GERNS FKVP S 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPEVLSI 
EEEVEETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMPYHKPDSSNEENDARWETKLDEWTSEGKTKPLIPEMCF 
IYSEENIEYSPPNAFLEEDGTSLLISCAKCCVRVHASCYGIPSH 
E I CDGW L CARC KRNAWT AE C C L CNLRGGALKQT KNN KW AHVM CA 
VAVPEVRFTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVSGA 
C IQCS YGRCPASFHVTCAHAAGVL\MEPDDWPYWN I TCFRHKV 
NPNVKS KACEKVI S VGQTV ITKHRNTRYYSCRVMAVTSQTFYEV 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCQVNS LS S PHVSQAQQETYLGFW INS KKSQCN I F 
LSGTY 


6057 


1 


853 


FVARLKEQEGEGGLGPRKEKGRARGRBRRRKMQLTRCCFVFLVQ 
GS LYLVI CGQDDG P PG S ED PERDDHEGQ PRPRVPRKRGH I S P KS 
RPMANSTLLGLLAPPGEAWGILGQPPNRPNHSPPPSAKVKKIFG 
WGDFYSN I KTVALNLLVTGKI VDHGNGTFS VHFQHNATGQGNI S 
ISLVPPSKAVEFHQEQQIFIEAKASKIFNC\RMEWEKVE\RGRR 
TSLFTHDPAKI CSRDHAQ SSATWS CSQPFKWCVYI AFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


H PLP S AS LGL P S VS LGVS LC VRSALLE AWPML P KRRRAR VGS P 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALLD 
I S W LTES LGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACQR 
PTPLTHHNTGLS EALE I LAEAAG FEG SEGRLLTFCRAAS VLKAL 
PS P VTTLS QLQGLPHFGEHS SRWQELLEHGVCEEVERVRRSE / 
RLFTQIFGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGEP 
SREAGPWASLNCTLDPSASTP 


6059 


2 


3650


QQDFESLADLTDHRAHRCPGDGDDDPQLSWVASSPSSKDVASPT 
QM IGDGCDLGLGEEEGGTGLP YP CQFCDKS FI RLS YLKRHEQ I H 
S D KLP FKCTYC SRLFKHKRS RDRH I KLHTGDKKYHCHECE AAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
EHLAKSEKEAKKDDFMCDYCEDTFSQTEELEKHVLTRHPQLSEK 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid
residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid
residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ADLQCIHCPEVFVDENTLLAHIHQAHANQKHKCPMCPE\QFSSV
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV
ERGSTPDSTLKPLRGQKKMRDDGQGWTKWYSCPYCSKRDFNSL
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNLNEHVRKIIHKN
HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP
SDGNNAFFOTQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV
VQPTQSFMEVYSCPYCTNSPIFGSILKLTKHIKENHKNIPLAHS
KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCNQCDLK
FSNFESFQTHLKLHLELLLRKQACPQCKEDFDSQESLLQHLTVH
YMTTSTHYVCESCDKQFSSVDD\LQKH\LIIDMPHPLCCTHCT\L
CQEVFDS\KVSI\QVHLAVKHSNEKKMYRCTACNWDFRKEADLQ
VHVKHSHLGNPAKAHKCIFCGETFSTEVELQCHITTHSKKYNCK
FCSKAFHAIILLEKHLREKHCVFDAATENGTANGVPPMATKKAE
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME
VLLQNHRLRDHNIRPGEDDGSRKKAEFIKGSHKCNVCSRTFFSE
NGLREHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD
TGTCRIOCMPLQSEEEFIEHCx3MHPDLRNSLTGFRCOTCMQTVT

STLELKIHGTFHMQKLAGS SAAS S PNGQGLQKLYKCALCLKEFR 
S KQDLVKLD VNGLP YGLCAGCMARS ANGQVGGLAPPE P ADR PCA 
GLRCPECSVKFES AEDLE S HMQVDHRDLTPETSG PRKGTQTS PV 
PRKKTYQCIKCQMTFENEREIQIHVANHMIEEGINHECKLCNQM 
FDSPAKLLCHLIEHSFBGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 
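
The symbol legend in the column header (one-letter residue codes, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion) lends itself to a simple consistency check on each printed segment. The following is a minimal sketch in Python and is not part of the original disclosure; the names `summarize_segment`, `RESIDUES` and `MARKERS` are hypothetical and chosen for illustration only. It assumes that whitespace inside a printed segment is a typesetting artifact, and that any character outside the legend indicates transcription damage rather than a residue.

```python
from collections import Counter

# The twenty one-letter residue codes listed in the table legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# The four non-residue symbols listed in the table legend.
MARKERS = {
    "X": "unknown",
    "*": "stop_codon",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize_segment(text):
    """Classify every non-whitespace character of a printed segment.

    Assumption (not stated in the source): spaces and line breaks inside a
    segment carry no meaning and can be dropped before counting.
    """
    compact = "".join(text.split())
    counts = Counter()
    unrecognized = []  # (position, character) pairs outside the legend
    for index, symbol in enumerate(compact):
        if symbol in RESIDUES:
            counts["residue"] += 1
        elif symbol in MARKERS:
            counts[MARKERS[symbol]] += 1
        else:
            unrecognized.append((index, symbol))
    return {
        "length": len(compact),
        "counts": dict(counts),
        "unrecognized": unrecognized,
    }


# First printed line of the segment listed for SEQ ID NO: 6070.
example = r"IRVTVDGEFLHYIFPLQFLDSPEW/RFTETHRGRHF\QVTLTAE"
print(summarize_segment(example))
```

A non-empty `unrecognized` list for a segment is a hint that the printed line should be compared against the original sequence listing before it is used.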


6060 


2145 



202 



SYEIVGKNKLEVNHSQLKALCKCSLPSRLLPLGENLPLLDRGFR 
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
DISASRPNILLLMADDLGIGDIGCYGNNTWRTPNIDRLAEDGVK 
LTQHI SAASLCTPSRAAFLTGRYPVRSGMVS S I GYRVLQWTGAS 
GGLPTNETTFAKILEEKGYATGLIGKWHLGLNCESASDHCHHPL 
HHGFDHF YGMP FS LMGDCAR WE LS E KR VNLEQKLNFLFQVLALV 
ALTLVAG KLTHL I P VSWMP VI WS ALS AVLLLAS S YFVGAL I VHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 
SFLHVHIPLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNSTLIYFTSDHGGSLENQLGNTQYGGWNGIYKGGKGMGGW 
EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTVVRLAGSEVP 
QDRVIDGQDLLPLLLGTAQHSDHEFLMHYCERFLHAARWHQRDR 
GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKVVHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMER\VQQAVWEHQRTLSPVPLQ 
LDRLGNIWRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330 


PWIHMKRKTIKNINTFENRMLMLDGMPAVRVKTELLESEQGSPN 
VHNYPDMEAVPLLLNNVKGEPPEDSLSVDHFQTQTEPVDLS INK 
ARTS PTAVS SSPVSMTASAS S PS STSTS SSSSSRLAS S PTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
SNKLSHVHRIPVWQSVPWYTAVRSPGNVNNTIWPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTALSIA 
RAVQEVHPS PVSRVRGNRMNNQKFPCS I S PFSIESTRRQRTVLN 
PPDSRKTAYSTDCDF\EGLQQKLYTKSSSPGRVHRRTHTGEKPY
KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
YPYCYQGGRVTCRVTMPCNWWVARMLGRV


6064 


913 


311 


NLPQSLPRPTEHSPPYSLEKMTDLVAVWDVALSDGVHKIEFEHG 
TTSGKRVVYVDGKEE IRKEWMFKLVGKETFYVGAAKTKATINID 

LEKDAMDWCNGKKLETAGE FVDDGTETHFS IGTH\ACYIKAV\ 
S SG \ KRKEG I IHTL I VDNRE I PE IAS 


6065 


1153 


641 


MSVRVARVAWVRGLGAS YRRGAS S F PVP P PGAQGVAELLRDATG 
AEEEAPWAATERRMPGQCSVLLFPGQGSQVVGMGRGLLNYPRVR 
ELYAAARRVLGYDLLELSLHGPQETUDRTVHCQPAIFVASLiAAV 
EKLHHLQPSVIENCVAAAGFSVGEFAALVFAGAMEFAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCX3SDGDVRIW 
EDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDGILTRFTTNANHWFNGDGTKIAAGSSD\FLVKIVDVMDSS 
QQKTFRGHDAPVLSLS FDPKDI FLASASCDGSVRVWQI SDQTCA 
I SWPLIiQKCNDVINAKS I CRLAWQP KSGKLLAI PVEKS VKLYRR 
ESWSHQFDLSDNFISQTLNIVTWSPCGQYLAAGSINGLIIVWNV 
ETKDCMERVKHEKGYAICX3LAWHPTCGRISYTDAEGNLGLLENV 
CD P SG KTSS S 1WS SRVE KD YNDLFDGDDM SNAGD FLNDNAVE I P 
SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 
GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS IG 1 1 RCYNDEQDNAI DVE FHDTS IHHAT 
HLSNTLNYT I ADLSHE AI LIiACE STDE LAS KLHCLHFS S WDS S K 
EWI IDLPQNEDI EAI CLGQGWAAAATS ALLLRL FT IGGVQKEVF 
SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 
QILHGDPLPLTRKSY1AWIGFSAEGTPCYVDSEGIVRMLNRGLG 
NTWTP I CNTREHCKGKSDHYWWG I HENPQQLRC I PCKGSRFP P 
TLPRPAVAILS FKLPYCQ I ATEKGQME EQFWRSVI FHNHIiDYLA 
KNGYEYEESTKNQATKEQQELLMKMLALSCKLEREFRCVELADL 
MTQNAVNLAIKYASRSRKL I LiAQKljSELAVEKAAELTATQVEEE 
EEE E DFRKKIjNAG YSNTATE WS QPR FRNQ VahtUAJLU^iyKADDha 
KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 
SKEPAMSMNSARSTNILDNMGKSSKKSTALSRTTNNEKSPIIKP 
LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPQNTENQRPKTGFQMWLEENRSNILSDNPDFSDEADIIKEGM 
IRr KVLib 1 bhRKVWANKAKGt* I AbfcAj l bAKivXlUtV VL/EioUiS 1 HJN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067 


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKTAUjQDGRRKVHYLF 
PDGKEMAEEYDEKTS ELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 
SVDQKERCI I VRTTNKKYYKKFS I PDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAI AP PVFVFQKDKGQKS PAEQKNLS DSGEEP 
RGEAEAPHHGTGHPESAGEHALEP PAPAGASASTP PPPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPSSGTNGVSLPADCTGAVPAASPDTAAW 
RSPSEAADEVCALEEKEPQKNESSNASEEEACEKKDPATQQAFV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 

e>T 13111 C?rnvrO 71 T*\7\ O PMV WTO/*' AWM C t?t>TrT PDDtTT TvTTT'WO C T""l7\ TVTT5 C*M 

S LENSTNS ADASSKKF Vr GQrJ M a £>K VLo F F Klil V is o U/UM X &M 
AAES GS E S S S QEAT P E KES LAE S AAAYTKATAR KCLLE KVE VI T 
GEEAESNVLQMQCKLFVFDKTSQSWVERGRGLIiRIlND^4ASTDDG 
TT ^ 9 R T i^D AGPR ^ T .P \ L T LNTIH.WAOMO I DKASEK\ S I R I TAM 
DNEDQGVKVFLISASSKDTGQVYAALHHRILALRSRVEQEQEAK 
MPAPEPGAAPSNEEDDSDDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


583


27 


PTRPGQAGS S SAMAAQRLGKRVLS KLQSPSRARGPGGS PGGLQK 
RHARVTVKYDRRELQRRLDVEKWIDGRliEELYRGMEADMPDEIN 
IDELLELESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 
Q \ PGLRQPS PS P \DGQPS APFQG PGARTAS PLTIiLALF PGP PER 
RPALLCVLSCI 


6070 


478 


858 


IRVTVDGEFLHYIFPLQFLDSPEW/RFTETHRGRHF\QVTLTAE 
TDCRYVSWRRKKLYLLFAQHRYI SRLFSVLIGSDIADKLYALND 
RVYIGKRYHYDI RLPNFYQMSTPE IRRSPLTQHFQNSRRYW 


6071 


2 



1654 


HEARTKGNMALARP \VRL FSLVTRLLLAPRRGLTVRSPDE PLPV 
VRIPVAIiQRQLEQRQSRRRNLPRPVLVRPGPLLVSARRPELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFS I ERAQQEAPAVRKLS 
SKGSFADLGAWKPRVLHALQE\AAPEVVQ\PTTVQSSTIPSLLR 
GRHWCAAETGSGKTLSYLLPLLQRLLG\HPSLDSLPIPAPRGL 
VLVPSREIiAQQVRAVAQPLGRSLGLLVRDLEGGHGMRRIRLQLS 
RQPSADVLVATPGALWKALKSRL ISLEQLS FLVLDEADTLLDES 
FLELVDYILE KSH IAEGPADLEDPFNPKAQLVLVGATFPEGVGQ 
LLNKVAS PDAVTTITSSKLHCIMPHVKQTFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWLGYILDDHKIQHLRL 
QGQMPALMRVGI FQSFQKSSRDILLCTDIASRGLDSTGVELWN 
YDFP PTLQDY IHRAGRVGRVGSEVPGTVISFVTHPWDVSLVQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 



742 


KMERTEMMPT I NS QLE F KS KP FPLVS S S RWL VKRGELT AYVEDT 
VLFSRRTSKQQVYFFLFNDVLIITKKKSEESYNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLFTLTVLSNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRS FT 
AKQ PDELS LQVAD WL I \ YQRVSDGWYEGER\LRDGERGWFPME 
CAKEITCQATIDKNVERMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/SILVHCAVGVSRSATLVIiAYLMIjYHHLT 
L VEA I KKVKDHRG 1 1 PNRGFLRQLIiALDRRLRQGLEA 


6074 



168 



1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRAS RAKV ILLTG YAHS S L PAELDS GACGGS S LNS EGNS GS G 
DSSS YDAPAGNS FLEDCELSRQIGAQLKHiPMNDQIRELQTI IR 
DKTASRGDFMFSADRLIRLWEEGLNQLPYKECMVTTPTGYKYE 
GVKFEKGNCGVSIMRSGEAMEQGLRDCCRSIRIGKILIQSDEET 
QRAKVY YAK FP P D I YRRKVLLM YP I LQTG \ NTV I EAVKVL I EHG 
VQPSVI ILLSLFS TPHGAKSI IQEFPEITI LTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


PPTCQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE I ERAECT I RMNDAPTTG YSADVGNKTTYRWAHS S VFRV 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGDTVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFI TEKRVFSS WAQLYG I TFSHPS WT 


6076 


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQVQLVQSGAEV 

KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 

GETIYAQKFQGRVTMTEDTSTDTAYMELSSLRSEDTAVYYCATD 

HGD YAFD I WGQGTMVTVSSAPTKAPDVFP I ISGCRHPKDNSPW 

LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 

SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 

PTAQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 

TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 

AHLTWEVAGKVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 

GTSVTCTLNHPSLPPQRLMALREPAAQAPVKLSLNLLASSDPPE 

A\ASWLLCEVSGFSPPNILLMWLEDHGEVNTSGFAPARPLPKP\

RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 

EVSYVTDHGPMK 




6077


1268 



LLPDMNLQPI FWIGLI SSVCCVFAQTDENRCLKANAKSCGECIQ 
AG PNCGW CTNS TFLQEGMPTSARCDDLEALKKKGCPPDD I ENPR 
GSKDIKKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGEP 
QTFTLKFKRAEDYPIDLYYIJ4\DLSYSMKDDLENVKSLGTDLMN 
EMRR I TSDFR IG FGS F VE KTVM P Y I S TTPAKLRNPCTS EQNCTS 
PFSYKNVLSLTNKGEVFNELVGKQRISGNLDSPEGGFDAIMQVA 
VCGS L I GWRNVTRLL VFS TDAG FHFAGDGKLGG I VLPNDGQCHL 
ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKE
IjKNLIPKSAVGTLSANSSNVIQLIIDAYNSIiSSEVILENGKLSE 
GVT I S YQS Y\ CKNGVNGTGBNGRKCSNI S I GDEVQFE I S ITSNK 
CPKKDSDSFKIRPLGFTEEVEVILQYICECECQSEGIPESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEVNSEDIGCFTARKENQ 
J? yKoAoJMHtjK v foAvjyL. v L-KiVxCUW l via ± x ovjj\.c r a u/no j 
NGL I CGGNGVCKCRVCECNPNYTGS ACDCSLDTSTCE ASNGQ I C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHWENPECPTGPDI IP IVAGWAG 
I VL IGLALLLI WKLLM I IHDRRE FAKFEKEKMNAKWDTGENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGILE

GSVRNSLWRPVPFKCPTCRKKTFSYWELIPLQVNYSLKGIVEKY 

NKIKISPKMPVCKGH\LGQPLNIF\CIj\TDMQLDIi/CGIC\ATR 

GKH.1 KH Vr Co XciUAiAyiiKiJlftJr tti>ljc\iOrCt l vjkkuij/iuojk.ijl' i u j 

ETSKRKSLQLLTKDSDKVICEFFEKLQHTLDQKKNEILSDFETMK

LAVMQAYDPEINKLNTILQEQRMAFNIAEAFKDVSEPIVFLQQM 

QEFREKIKVIKETPLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 

LSLPQDTGTFISKIPWSFYKLFLLIIjLLGIiVIVFGPTMFLEWSL 

FDDIiATWKGCLSNFSSYLTKTADFIEQSVFYWEQVTDGFFIFNE 

RFKNFTLWLNNVAEFVCKYKLL


6079 


1586


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCRNLQEFLGGLSP 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
LWVKKE FSKAQEESTGLLSGLRI WHTQLLPGGLQGbl LNP I FRQ ( 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVL
HFMVGS PSAAVSQDLAQLLSQAGLMKSTEPGE P PCITSAGFQFL 

t t noiniv r\T j.tvchat OVT AfAACO^MTiT T7TTTT CT?T.T?AT .G'PQ'P'T.nTm 1 

LLiDT PAyjjW i r NjjU x J_iy lAyoKwNLLJlj vn XLior Ltr ytior o xuv?raj i 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQOI IHFLRTRAHP S 
VMLKQTP VIjP PTITDQ I J^LWELERDRLRFTEGVIiYNQFLSQVDF 
ELL \ LAHAP KLGVLVF B / NT PAKRLMWTP AGH S D VKR F W KRQK 
HSS


6080 



1 


1199 


IETIDHVGEFAMAAQAAGVSRQRAATQGLGSNQNALKYLGQDFK
TLRQQCLDSGVLFKDPEFPACPSALGYKDLGPGSPQTQGI IWKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTLNEELL 

YRVVPRDQDFQriW i Av* Xrns\i fJ-»C r r is r \r ny luow vn vv iuur 
LPTKNGQLLFLHSEQGNEFWSALLEKAYAKLNGCYEALiAGGSTV | 
EGFEDFTGGISEFYDLKKPPANLYQIIRKALCAGSLLGCSIDVY 
SAAEAEAITSQKLVKSHAYSVTGVEEVNFQGHPEKLIRLRNPWG 
EVEWSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 
QFSRLE I CNLS PDS LSS EE VHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS


6081 




n r~ r~ 

865 


"t?mt dt t t dt dt T WA /naT.annRRPRT.FMPR^UTVOKGLiCIFVHC 1 
prl' 'rbl v 1 1 *r I. 1 *r M -*?*■"■/ >■ ^t-ii-Lt\\s J-Z-rvrvc x\,xjcirig a>3 v x v ^mju^^i. vwv. 1 

SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRUNCSLS IRDARRRDNGS YFFWVARGRTKFS Y 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
P P I FS WMSAAPTSLGPRTLHS SVLT 1 1 PRPQDHGTNLI CQVTFP 
GAGVTTERTIQLSVSWKSGTVEEWVLAVGWAVKILLLCLCLI 
I LS FHKKKAVRAVE VEENVYAVMG 


6082 


283 


1288 


EARS PGPTQTRTAPGLAAPGLAQ PAALRLLLSRPPSAAMDGDGD 
PESVGQFiltll»AoFiilliv^^-C i Ao/U!»ij \ x \jju.ci 
L PE PLLA/ LRVLAALPRHE \ LVQACR \ LVCLRWKELVDGAPLWL 
LKCQQEGLVPEGGVEEERDHWQQFYFLSKRRRNLLRNPCGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQVIDLQAEGYWEELLDTTQPAIWKDWYSGRSDAGCIiYELTV 
KLLSEHENVLAEFSSGQVAVPQDSLX3GGWMEISHTFTDYGPGVR 
FVRFEHGGQDS VYWKGWFG AR VTNS S VWVE P 


6083 


1865


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE

CDMCT7T TV DMT CTT T/~* 7\ CTO 7WTM/*»TrJV/" , «y"*T TrvTT nvMnn A~n.-rf-WLit-r -r n 

o KNb r lAi'W L£> 1 1 luAb I AAK I MG VAGGLTNLS KM PACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6084 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE 
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 

olulaf JL FJN L>&±± IwAo A AfUV X IVKjr V AUvjL 1 W Lib KMPACN I M LLG 

AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6085 


2 



1456 


SGPRSFQGNRAVGRISLGGKRNPBVTLLPGVSSERVRRWRRARV 
GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLREGEELVM 
DEEAYVL YHRAQTGAP CL S FD I VRDHLGDNRTEL PLTL YLCAGT 
QAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLELAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFALR
KLLy V V boFQALAAr LRDriQAQMKP I FSFAGHMGEGFALDWSPR \ 
VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
S P TENTVFAS CSADAS I R I WDI RAAPS KACMLTTATAHDGDVNV 
I S WS RRE P FLLSGGDDGALKI WDLRQFKSGS PVATF KQHVAP VT 
SVEWHPQDSGVFAASGADKQITQWDLG/ IVERDPEAGDVEADPG 

LADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRT 
ISV 


6086



2419


1357 



GAATQHGGAMNLL P CNPHGNGLLYAG FNQDHGCFACGMENGFR V 
YNTDPLKEKEKQEFLEGGVGHVEMLFRCNYLALVGGGKKPKYPP 
in JxvwiwuuLKJvxv i VJ.isi.hr bTa vKAVKliKK\DKIVVVLDSMIKV I 
FTFTHNP \HQIJIVFE\TCYNPKGLCVLCPNSNNSLLAFPGTHTG 
HVQLVDI^TEKPPVDIPAHEGVT^CIALNLQGTRIATASEKGT 
LIRIFDTSSGHLIQELRRGSQAANIYCINFNQDASLICVSSDHG 
TVHI FAAED P KRNKQS S LAS AS FLPKYFS S KWS FS KFQ VPS G S P 

CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL


6087 


476 


1877 


QNSQRTGliPITIFSRSFPLLTGSDLCENMPCTCTWRNWRQWIRP 

LVAVI YLVS I WAVPLCVWELQKLEVGIHTKAWFIAGI FLLLTI 

PISLWVIWHLVHYTQPELQKPIIRILWMVPIYSLDSWIALKYP 

GIAIYVDTCRECYEAWIYNFMGFLTNYLTNRYPNLVLILEAKD 
nnTTHPPPT.Pr , r^DWZiM(^F\/T.T.Trpr»in'/^T7T.nv^rxA7DDT? r P'pTx/aT t 

VSJ*>J*-r * * JJx-vV-trJrWrti'iOe v JjijrrC^.J\J^j v lay 1 1 V VKirr 1 IXVALJL j 

CELLG I YDEGNFS FSNAWTYLVI INNMSQLFAMYCLLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTW 
EWQTVEAVATGLQDF 1 1 CI EMFLAAI A\HHYT FS YKPYVQEAEE 1 
GS C FDS FLAMWDVS D I RDDIS EQVRHVGR T VRGHPR KKLFPEDQ 
DQNEHTSLLS S SSQDAI S I ASSMPPS PMGHYQGFGHTVTPQTTP 
TTAKISDEILSDTIGEKKEPSDKSVDS


6088 


1684


689 


GASGLVRLLQQGHRCLLAP VAP KLVP PVRGVKKGFRAAFRFQKE 
LERQRLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
KTAFVNS CYIKSEEAKRQQLG IEKEAVLLNLKSNQELSEQGTS F 
SQTCLTOFLEDEYPDMPTEGIKNLVDFLTGEEWCHVARNLAVE 
QLTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
TQMTGKELFEMWKIINPMGLLVEELKKRNVSAPESRLTRQSG\A
PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVALRKLYGF 
TENRRPWNYS KPKETLRAEKS ITAS 


6089 


3 


3054 


TRIiG IPGSTISSRP RLCALAAEGHFLGHS WTGSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQS LVKHSSGI KGSLPLQK 
LHLVSRSIYHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRKLKFS 
PIKYGYQPRRNFWPARLATRLLKLRYLILGSAVGGGYTAKKTFD 
QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVLQKDD 
KG I P F I ESLRKSL I DM YS EVLDVLS D YDAS YNTQDHLPR WWG 
DQSAGKTSVLEMI AQAR I FPRGSGEMMTRS PVKVTLSEGPHHVA 
LFKDSSREFDLTKEEDLAALRHEIELRMRKOTKEGCTVSPETIS 
LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETIFSISKAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQIIEG^FPMKALGYFAVVTGKGNSSESIEAI 
REYEEEFFQNSKXiLKTSMLKAHQVTTRNLSLAVSDCFWKMVRES 
VEQQADS FKATRFNLETEWKNNYPRLRELDRNELFEKAKNEILD 
E VI SLSQVTP KHWEE I LQQSLWERVSTHVI ENIYLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDD I FDKLKEAVKEES I KRHKWNDFAEDSLRVI QHNALEDRS I 
S D KQQWDAAI Y FMEEALQARLKDTENAI ENMVGPD \WKKRWLYW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSLI KDTWHQVYRRHFLKTALNHCNLCRRGFYYYQRH 
FVDSELECNDWLFWRIQRMLAITANTLRQQLTNTEVRRLEKNV 
KEVLEDFAEDGEKKIKLLTGKRVQLAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS/ASPPLiATQTWPLQHCKI PELPVQAS IL 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 
FNLLMVTT I VLGRRFIGS I VKEASQRGKVSL FRS I LLFLTRFTV 
LTATG WS hCRS L IHLFRTYS FLNLL / FPLLS VWDVHS VPAAELR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMART P CP / PHACCLS P SL I RS EVE FLKMDFNWRMKE VLVS S ML 
S AYYVAFVPVWFVKNTHYYDKRWSCELFLLVS I STS VILMQHLL 
PASYCDLLHKAAAHLGCWQKVDPALCSNVLQHPWTEECMWPQGV 
LVKHSKNVYKAVGHYNVAIPSDVSHFRFHFFFSKPLRILNILLL 
LEGAVIVYQLYSLMSSEKWHQTISJ^ALILFSNYYAFFKLLRDRL 
VLGKAYSYSASPQRDLDHRFS 


6091 


3279 


412 


SSRTREMEEKE I LRRQ IRLLQGL I DD YKTLHGNAPAPGTPAASG 
WQ P PTYH SGRAFS AR YPRP SRRG YSSHHGPS WRKKYSLVNRP PG 
PSDPPADHAVRPLHGARGGQPPVPQQHVLERQVQLSQGQNWIK 
VKPPS KSGSASASGAQRGS LEE FEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPLLVCQKEPGKP RMVKSVGS VGDS PRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 
LLGDRRVDAGHTDQPVPSGSVGGPARPASGPRQAREASLWTCR 
TNKFRKNN YKWVAAS S KS PRVARRALS PRVAAENVCKASAGMAN 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 
SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 
RRQALRG KS S P VLKKT PNKGLVQ VTTHRL CRLPPS RAHLPTKEA 
S S LHAVRTAPTSK V I KTR YR I VKKTPAS PLS APPFPLS LPS WRA 
RRLSUSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 
KVSANKLSKTSGQPSDAGSRPLLRTGRLPPAGSCSRSLASRAVQ 
RS LAI I RQARQRREKRKE YCM Y YNRFGRCNRGERCP Y IHDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVS KEKMPVCS YFLKG I CSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 
AC PRGAQCQLLHRTQKRHS RRAATS PAPG PS DATARSR VS ASHG 
PRKPSASQRPTRQTPS SAALTAAAVAAP PHCPGGSAS PSSS KAS 
SSSSSSSSPPASLDHEAPSLQEAALAAACSNRLCKLPSFISLQS 
SPS PGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190 


AKAP PTGESS E PEAKVLHTKRL YRAWEAVHRLDLILCNKTAYQ 
EVFKPENISLRNKLRELCVKLMFLHPVDYGRKAEELLWRKVYYE 
VIQLIKTNKKHIHSRSTLECAYRTHLVAGIGFYQHLLIiYIQSHY 
QLELQCC IDWTHVTDPL I GCKKPVSASGKEMDWAQMACHRCLVY 
U3DLSRYQNELAGVDTELLAERFYYQALSVAPQIGMPFNQLGTL 
AGS KYYNVEAMYCYLRCI QSEVS FEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
LI FQMVI ICIiMCVHSLERAGSKQYSAAI AFTLALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSBASIASNLQAMSTQ 
MFQTKRCFRIAPTFSNIililjQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEESWRICCIRS FGHFIARLQGS ILQFNPEVGI F 
VS I AQSEQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQIiEV 
SQLEGSLQQPKAQSAMSPYLVPDTQALCHHLPVIRQIiATSGRFI 
VI I PRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKIIiDSCKQLT\LAQGAGEEDPSG 
MVTII TGLPLDNPSLLSGPMQAALQAAAHASVDI KNVLDFYKQW 
KEIG 


6093 


76 



1002 


ACX5RRAMLALRVART/SRWGAL\RGAVWAPGTRPSKRRACWALL 
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP 
VLGYLI lEEDFNIALGVFALAGLTDLIiDGFIARNWANQRSALGS 
ALDPLADKI LI S I LYVSLiTYADLI PVPLTYM 1 1 SRDVMLI AAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFI S KVNTAVQLILVA 
ASLAAP VFNYADS I YLQILWCFTAFTTAASAYSYYHYGRKTVQV 
IKD 


6094 


23 



1010 


PFLRCLRGDQKAKMSERKVLNKYYPPDFDPSKIPKLKLPKDRQY 
VVRLMAPFNMRCKTCGE YI YKGKKFNARKETVQNE VYLGL P I FR 
FYIKCTRCLAEITFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQELKDLNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLWVKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTI RCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE \DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 



VKVHTVPKPGKGADLSKPPCRKAKEIRKERKRLKLMQQNPAGEL
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
VTOSSPPSSQFKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYlfDPDYSFLSLGVYSALREIAFTRQIiHEKTSQLSYYY 
MG F Y IHSCP KMKYKGQ YRPSDLLCPETYVWVP I EQCLPS LENS K 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQQKDPS 
EEAAVLQYAS LVGQKCSERMLLFRN 


6096 


2277 


575 


QR VRAALLS S AMEDS EALG FEHMGLDPRLLQAVT DLGW SR PTL I 
QEKAI PLALEGKDLLARARTGSGKTAAYAIPMLQLLLHRKATGP 
WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 
EDSVSQRAVLMEKPDWVGTPSRILSHLQQDSLKLRDSLELLW 
DEADLL FS FG FE E ELKS LLCHLPR I YQAFLMS AT FNE DVQALKE 
LILHNPVTLKLQESQLPGPDQLQQFQVVCETEEDKFLLLYALLK 
LSLIRGKSLLFVNTLERSYRLRLFLEQFSIPTCVLNGBLPLRSR 
CHI I SQFNQGF YDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGI D FHHVS AVLNFDLPP T P EAY IHRAGRTARANN PG I V 
LTFVLPTEQFHLGKIEELLSGENRGP I LLP YQFRMEE I EG FRYR 
CRDAMRS VTKQAIREARLKE IKEELLHSEKLKTYFEDNPR \DLQ 
LLRHDLPLHPAWKPHLGHVPDYLVPPALRGLVRPHKK\GRSCL 
PLVGRPREQSPRTHCAASSTKERNSDPQPSPPEWGPLWS 


6097 



1673 


192 


APGTMSGGKKKS S FQITS VTTDYEGPGS PGASDPPTPQP PTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLE PHS FGGLLEG I RGASGGAGGRSLDSRL 
EIASLGLGAPTP PSGLSQGPTSWLRPP PTS PGPQARS FTGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 
RVEAEAGGSGARTPPLSRRKAVDMRLRMELGAPEEMGQVPPLDS 
RPSSPALYFTHDASLVHKSPDPFGAVAAQKFSLAHSMLAISGHL 
DSDDDSGSGSLVGIDNKIEQAMDLVKSHLMFAVREEVEVLKEQI 
RELAERNAALEQENGLLRALA\SPEQLGSAGPPRGVPR\LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ


6098 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE 
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6100 


2 


713 


FVEVSGYRSRADPEPRGRDTMTYAYLFKYI I IGDTGVGKSCLLL 
QFTD KRFQPVHDLTIGVEFGARMVN IDGKQI KLQIWDTAGQES F 
RSITRSYYRGAAGALLVYDITRRETFNHLTSWLEDARQHSSSNM 
VI ML I GNKSDLESRRDVKREEGEAFARE \HGLI FMETS AKTACN 
VEEAF INTAKE I YRKI QQGLFDVHNE ANG I KI GPQQS I STSVGP 
SASQRNSRDIGSNSGCC 


6101 


1 


1399 


FRGRAWPLREVSHWLGCRRVCSWSASWGRLPALSARLSPLLAFR 
GKMVFPLSCAVQQYAWGKMGSNSEVARLLASSDPLAQIAEDKPY 
AELWMGTHPRGDAKILDNRISQKTLSQWIAENQDSLGSKVKDTF 
NGNLPFL FKVLSVETPLS IQAHPNKELAEKLHLQAPQHYPDANH 
KPEMAIALTPFQGLCGFRPVEEIVTFLKKVPEFQFLIGDEAATH 
LKQTI^HDSQAVASSLQSCFSHLMKSEKKVVVEQLNLLVKRISQ 
QAAAGNNMEDIFGELLLQLHQQYPGDIGCFAIYFLNIiLTLKPGE 
AMFLEAWPHAYLKGDCVECMACSDISrrVRAGLTPKFIDVPTLCE 

MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKAXEVP

G\S\rrEYKDLAr£»SASILLMVQGTVIASTPTTQTPIPIiQRGGVL 
FIGANESVSLKLTEPKDLLIFRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGEIGASPAAPCCSESGDERKN 
LEEKSDINVTVLIGS KQ VSEGTDNGDLPS YVSAFI EKEVGNDLK 
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNABES 
KQFLNQFLEQETHLFSAINSHLLTAQPWMDDLGTM I SQI EE I ER 
HLAYLKWISQIEELSDNIQQYLMTNNVPEAASTLVSMAELDIKL 
QESSCTHLLGFMRATVKFWHKILKDKLTSDFEEILAQLHWPFIA 
PPQSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPK\ 
HSQKNTLFLP PLLS S / W P IQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEWYLAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 
LMMLVLEKIATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP 
GTFASCMHILSEETCFQRWLTVERKFALQKMDSMIjSSEAAWVSQ 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDhVDD FR I RLTQ VMKEETRAS LGFRYCAI LNAVNY I S TVLA 
DWADNVFFLQLQQAALEVFAENNTLS KLQLGQLASMESS VFDDM 
INLbERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KH I KEACI VLNLNVGSALTAGKDVLP VQLQGS FPAT 


6103 



207 


2523 


ESNSTMTTYLEFIQQNEERDGVRFSWNVWPSSRIjEATRMWPVA 
ALFTPLKERPDLPPIQYEPVIjCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPSYAG I S SLNQPAELLPQFS S I E Y WLRGPQM 
PL I FLYWDTCMEDEDLQALKESMQMSLSLLP PTALVGL I TFGR 
MVQ VHELG CEGIS KS YVFRGTKDLS AKQLQEMLGLS KVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 
PLRSSGVALS IAVGLLECTFPNTGAR IMMFIGGPATQGPGMWG 
DELKTP IRSWHD I DKDNAKYVKKGTKHFEALANRAATTGHVI DI 
YACALDQTGLLEMKCCPNLTGGYMVMGDS FNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENEIGTGGTCQWKICGLSPTTTTiAIYFEVVNQHNAPIPQGG\RG 
A\IQFVTQY\QHSSGQRRIRVTTIARN\WADAQTQIQNIAASFD 
QEAAAIIjMARLAIYRAETEEGPDVLRWLDRQLIRLCQKFGEYHK 
DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRILLM 
DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 
I LHSRFPMPR YI DTEHGGS QARFLLS KVNPSQTHNNMYAWGQES 
GAP I LTDDVSLQVFMDHLKKliAVSSAA 


6104 


124 


732 


KVSEY 1 1 LS KDKI L FHALAM LV LWS P WSAARGVLRNYWERLLR 
KLPQSRPGFPSPPWGPALAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSIFWMAAPKNRRTIEVNRCRRRNPQKLIKVKNNIDVCPECGH 
LKQKHVLCAYCYEKVCKETAEIRRQIGKQEGGPFKAPTIETVVL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


989 


PLHGACTSLVLQRFCHRRPRPCAPARPEDMRRPAAVPIiLLLLCF 
GSQRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVSIQRNGSHF 
CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASS ADVALVELEAPVPFTNYI LPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYOPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQS WLQAGVI S WGEGCARQNRPGVYIRVTAHHNWIHRI I PK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 



1302 


GRP PTAFHTGRP PTAN KWDFKljDl)iU<.oUAKlilj 1 o IfibKbKriiAo 
AGLRRDRCALRRWPLRRAPLARATRRRAGS PRRCAPRPRACPQG 
WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWIiRQAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 

± r L-K M V^JttW V Jj LJj ir\J iSJxv, Ki v LLN JVKJN JVr* Kl- V L-rlir U wOiN X 1 WI\.\J r VL 

GLDGKTYRNECAL.LKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\ VDQTNNAYCVTCNRI CPE PAS SEQYLCGNDGVTYS \SAC 
HLRKATCLLGRS IGLAYEGKCI KAKS CED IQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGS CNS I S EDTEEEEEDEDQDYS FPISS ILEW 


6107 


623 


168 


SRCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDBDHKGYLSRE 
DFKTAWMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKEAQRYRNEVRHIFTAFDTYYRGFLTLEDFKKAFRQVAPKLPE 
RTVLE VFREV \ DRDS\ DGH VS F 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW 
CLPRRYMKHKRDDGirliKyQUillA.vJJv 1 FVOTXwvj* V VlYiUv^oWijVLLU 
YYFYDLLVYWIGI FCLASATGLYSCLAPCVRRLP\SASAGESA 
LLAPTI PNNS LPYFHKRPQARMLLLALFCVAVS WWGVFRNEDQ 
WAWVLQDALGIAFCLYMLKTIRLPTFKACTLLLLVIjFLYDIFFV 
FITP FLTKSGS S I MVEVATGPSDSATREKLPMVLKVPRLNSS PL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
AYGVGLL VT FVALALMQRGQ PALLYIiVPCTLVTSCAVALWRREL 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS P W PAEQS P KS RTS EE MGAGAPMREPGS PAES EGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 



CRSRAGAASGGAI LEGTKLRRQRVDTNKPLDPLVPSALRAAMLY 

I£DYLEMIEQLPMDLRDRFTEMREMD1jQVQNAMDQJ^ 

MNAKKNKPEWREEQMASIKKDYYKALEDADEKVQLANQIYDLVD 

RHLRKLDQELAKFKMELEADNAGITEILERRSLEIiDTPSQPVNN 

HHAHSHTPVEKRKYNPTSHHTTTDHI PEKKFKSEALLSTLTSDA 

SKENTIjGCRNNNSTASSIMAYKVNSSQPLGSYNIGSLSSGTGAG 

G I \TMAAAQAVQATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 

SMARETVGYSSSSALMTTLTQNASSSAADSRSGRKSKNNNKSSS 

QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 

YDPNEPRYCICNQVSYGEMVGCDTQDCPIEWFKYGCVGLTEAPK 

GKWY C PQ CT\ AAMKRRGSRHK 


6110 


77 


2464 


ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLALLAATCSR I E S PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 
S STNGSNGS ESSKNRTVSGGQ YWAAAPNLQNQQVLTGLPGVMP 
N IQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQ I Q 1 1 PGANQQ 
I ITNRGSGGNI IAAMPNLLQQAVPLQGLANNVLSGQTQ YVTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTISSASLVSSQAS S SSFFTNANSYSTTTTTSNMG I MNFTTSG 

ssgtnsqgqtpqrvsglqgsdalniqqnqtsggslqagqqkege 
o\nootqaapkslsrpolvogg\oalq\afqaaplsgqtfttqa 
i sqetlqnlqlqavpnsgp i i i rtptvgpngqvswqtlqlqnlq 

VQNPQAQT I TLAPMQGVSLGQTS S SNTTLTP I ASAAS I PAGTVT 
VNAAQLSSMPGLQTINLSALGTSGIQVHPIQGLPLAIANAPGDH 
GAQLGLHGAGGDG I HDDTAGGE EGENS PDAQPQAGRRTRREACT 
CP YCKDS EGRGSGDPGKKKQH I CH IQGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6111 


1637 



797 


RVD PR VRGAMAP WG KRLAGVRG VLLDISGVL YDSGAGGGTAI AG 
S VEAVARLKRSRIjKVRFCTNESQKSRAELVGQLQRLG FDI SEQE 
VTAPAPAACQILKERGLRPYLLIHDGV\ASEFDQIDTS/STPNC 
WIADAGESFSYQNMNNAFQVLMELEKPVLISIjGKGRYYKETSG 
LMLD VGP YMKALE YACG I KAEVGGKPS PE FFKS ALQA I GVEAHQ 

AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6112 


77 




MS SHKS FKS KRFLAKKQKPNRPI LQWI WLKTGNKIRHNWK 


6113 


1779 


567 


WEGRSWAACGVNLQGAWGERSGVRASEAESPGKRADVSWWSROL 
ETMVDH LANTE INSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPRIFFLFNDILVYGSIVLNKRKYRSQHIIPLEEVT 
LELLPETLQAKNRWMIKTAKKSFVVSAASATERQEWISHIEECV 
RRQLRATGRPA\ STEHAAPWI PDKATDI CMRCTQTRFSALTRRH 
HCRKCRWVCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAEEQGAGVPRAASHLARP I CGRP VEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPRPAEHliSPSQLHCPGPQEGSSRSC 
PGLRDP I PWWQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFR 
KPQNTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLPVAAPARGPGPGHPAGPAGPRPARTPPASP 
HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTREDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6115


324 


71 


DVCGRVCAHPHL YTH I HMHI CAHAC \ IHTHAQLC/ ITASHALAH 
S HL YTCMVMLTAS HT P SHTHPHTAVHKEHRAD VLRGTLTP LR 


6116 


595 


1430 



TGVMP PGR WHAA/ 1 S S SGPVFEGARA\ LQTVKKBEEDES YT P VQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQILELLVLEQFLSILPGELRVWVQLHNPESGEE\L 
WPCWRS CRGTLMGHPGGTRAIiP \ EPRCALDGYRS \ LRSAQI WS L 
ASPLRSSSALGDHLEPPYE IEARDFLAGQSDTPAAQMPALFPRE 
GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 



VGVPS PAP PCS WE VG PGGGWTPG I LKEGQGGRRTPLLLLATRTR 

GLLSLFPPAAMHPAAFPLPVWAAVLWGAAPTRGLIRATSDHNA 

SMDFADLPALFGATLSQEGLQGFLVEAHPDNACSPIAPPPPAPV

NGS VF I ALLRRFDCNFDLKVLNAQKAG YGAAVVHIWNSNELLNM I 

VWNSEBIQQQIWIPSVFIGERSSEYLRAIjFVYEKGARVLLVPDN 

TFPI/3YYLIPFTGIVGLLVLAMGAVMIARCIQHRKRLQRNRLTK 

\EQLKQ I \ PTHDYQKGDQ YDVCAI CLDE YEDGDKLRVLPCAHAY 

HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDE 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 

SPPSSPVILV 


6118 


1044


247 


STISCRACTSGATPGAQSHRSARGHAAGGKETAALGMERGKVKK 
KEKEKETQKEKIGEKGREEKVKRKEVEQKIKQEKQEKQERRKGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 
DSQMEFLEIGGSKPFRSYWEMYLSN/ADSLARSFSVGFKQDSQP 
ITWKAKKYLHQLIAANPVLPLWFANKQDLEAAYHITDIHEALA 
II 


6119 


1217


462 


DPRFVTENTTKAPAQBRTTQPRSSREGTLRSTMEYLSALNPSDL 
LRS VSN ISSEFGRRVWTS APP PQRPFRVCDHKRT I RKGLTAATR 
QELLAKALETLLLNGVLTLVLEEDGTAVDSEDFFQLLEDDTCLM 
VLQSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
DLFGSLNVKATFYGLYSMSCDFQGL\GPKKVLRELLRWTSTLLQ 
GLGHMLLGI SSTLRHAVEGAEQWQQKGRLHSY 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6120 


785 


179 


LERAGGGGLSSRALVGSGACLSLVARANGKGLPRGRKEFVEAVR 
VRYVAFR YRTPRAVCLRLWS CRREVIMSGRGKQGGKVRAKAKSR 
SSRAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAE 
ILELAGNAARDNKKTRI IPRHLQLAIRNDEELNKLLGKVT I AQG 
G\VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


FVRAQARGSRQPVRRPIjLGAGSRLRCRSCGRMEPLKVEKFATAN 
RGNGLRAVTPIjRPGELLFRSDPLAYTVCKGSRGWCDRCIiLGKE 

klmrcsqcrvakycsakcqkkawpdhkreckclksckpryppds 
vrllgrwfklmdgapseseklysfydlesninkltedkkeglr 

QLVMTFQHFMREE I QDAS QLPPAFDLFEAFAKVI CNS FTICNAE 
MQEVGVGLYPSISLLNHSCDPNCSIVFNGPHIiLLRAVRDIEVGE 
ELTICYLDMLMTSEERRKQLRDQYCFECD\CFRCQTQDKDADML 
TGDEQVWKEVQESLKKIEELKAHWKWEQVLAMCQAIISSNSERL 
PDINIYQLKVTjDCAMDACINLGLIiEEALFYGTRTMEPYRIFFPG 
SHPVRGVQVMKVGKLQLHQGMFPQAMKNLRLAFDIMRVTHGREH 
SlilEDLILLLE /AMRRQHQS I LRERSQRE IRRVSLLNALLRSHT 
LCFVSCVNLSYWKFCSVFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG 
NTGTQTNGLDFQKQPVPVGGAISTAQAQAFIiGHLHQVQLAGTSL 
QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 
GGQ I TGLTLTP AQQQLLLQQAQAQAQLLAAAVQQHSASQQHS AA 
GATISASAATPMTQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 
LLQSQPRI\TLTSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEE P \ SDLE ELEQFAKT FKQRR I KLGFT\QGDAGLAMVKL YGND 
FS PTT I FRFEALNLSFKNMCKLKPLLEKWLNDAENLSSDS SLSS 
PSALNSPGIEGLSRRRKKRTSIEA\NIRVALEKSFLEN\QKPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 
IKAIFPSPTSLVATTPSLVTSSAATTLTVSPVLPLTSAAVTNLS 
VTGTSDTTSNNTATVISTAP PASSAVTS PSLSPS PSASASTSEA 
SSASETSTTQTTS TPbS SPLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIQALASGGSLP ITSLDATGNLVFANAGGA 
PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 
TSTSAES IQNSLFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL 
HLQPLEMKRVGWFTPADYGKVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILS IT 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEVJjDCHQFSLDPNT 
S RDI S I VFT PDFTS S WVIRDLS LVTAADLE FRFTLNVTLPHHLL 
PLCTuOWPGPSWEESFWRLTVFFVSLSLLGVILIAFQQAQYILM 
E FMKTRQRQNAS S S S QQNNGPMDVIS PHSYKSNCKNFLDTYGPS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK
HKTSTAAASS TSTTTEEKQTS PLGSSLPAAKEDI CTDAMRENWI
S LR YASG I NVNLQKNLTLP KNLLNKE ENTLKNT I VFSNPSS ECS 
MKEGIQTCMFPKETDIKTSENTAEFKERELCPLKTSKKLPENHL 
PRNSPQYHQPDLPE I SRKNNG1WQQVPVKNEVDHCENLKKVDTK 
PSS EKKIHKTSREDMFS EKQD I PFVEQEDPYRKKKLQEKREGNL 
QNLNWS KS RTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCS DSSSDCGS S SGSVRASRGSWGSWS STSSSDGDKKPM 
VDAQHFLPAGDSVSQNDFPSEAPISLNLSHNICNPMTGNSLPQY 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 
NGVPCV I QESAP VHNS F I DWS ATCEGQFSSAYCPLELND YNAFP 
EENMNYANGFPCPADVQTDF I DHNSQSTWKTP P\NMPAS \ WGNA 
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN
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SEQ 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PTTEHSD/THMENQA\WCKEYYPGF\NPFRAYMNLDIWTTT\A 
NRNANFPJjSRDSSYCGNV 


6124 


1573 


236 


SDEALRLAGERGMGRVQLFEISLSHGRWYSPGEPLAGTVRVRIj 

GAPLPFRAIRVTCIGSCGVSNKANDTAWWEEGYFNSSLSLADK 
GS LPAGEHS FP FQ FLLPAT APTS FEG P PGKI VHOVP a a t ktdt? t? 

SKDHKCSLVFYILSPLNLNS I PDIEQ PNVASATKKFS YKLVKTG 
SVVLTASTDLRGYVVGQALQliHADVENQSGKDTSPVVASLLQKV 
SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 
PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSS VPGAPEPCPQDGS PASHPLHPPLCI STGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEOSCGGVE 
PSLTPES 


6125 


1 


904 


KTCPKLTCAFTVS VPDS CCRVCRGDGELS WEHSDGD I FRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
I VQ I VINNKHKHGQ VCVSNGKT YS HGES WHPNLRAFG I VECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEBL 
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGI LQHFHI EKI S KRM FEELPHFKLVTRTTLSOWKT 

FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389


RLLSEAP CPRSRRRFQMNPE WGQAF VHVAVAGGLCAVAVFTG I F 
DSVSVQVGYEHYAEAPVAGLPAFLAMPFNSLVNMAYTLLGLSWL
HRGGAMGLGPRYLKDVFAAMALLYGPVQWLRLWTQWRRAAVLDQ 
WLTLPIFAWPVAWCLYLDRGWRP\WLFLSLECVSLASYGLALLH 
PQG FEVALGAHWPAVGQALRT \HRHYG/SATP SATYLALGVLS 
CLGFVVLKLCDHQLARWRLFQCLTGHFWSKVCD VLOFHFAFT.pt. 

■ S_ ■ ~ ~ r * ^^^tai ^ A A Ai » v AV ™ i w AJy* A> AAA. AA Aj Aj A. J^J 

THFNTHPRFHPSGGKTR 


6127 


1335 


463 


vlprrclvfwntmdssrsptlgrldaagfwqvwqrfdadekgy 
I eekeldafflhmlmklgtddtvmkanlhkvkqq fmttqdas kd 
grirmkeiagmflsedewflllfrrenpldssvefmqiwrkyda, 
dssgfisaaelrnflrdlflhhkkaiseajkleeytgtmmkifdr 

NKIXjRLDLNDIJ^ILALQENFLLQFKMDACSTEKRKGDFEKI fa 

YYDVSKTGALEGP\EVDGFVKDMMELVQPSISGYDLDKFREILIi. 

RHCDVNKDGKIQKSELALCLGLKINP 


6128 



251X 


843 


TC^SRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 

SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR" 

GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 

MSSQWGIEPLYIKAEPASPDSPKGSSETETEPPVALAPG\PAP 

TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 

GYHYGVASCEACKAFFKRTI QGSIEYS CPASNECE ITKRRRKAC 

QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 

FPAGPLAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 

GHLPAVATIiCDLFDREI WTI SWAKS I PGFSSLSLSDQMS VLQS 

VWMEVLVLGVAQRS ltlqdelafae ylvldeegarpaglgelg \ 

AAL VRRLQALRLEREE YVLLKALALANSDS VHI EDEPRLWS 
SCEKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 
KVLAH F YGVKLEGBCVPMHKL FL3MLEAMMD 


6129 


1764


771 


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKTKCMM 
KSSIXIVIKHAGVYSTGLAIWGAICDFCEAVfVCHGRKCLSTHACA 
CPIiTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQVLEAE T FKCVS CNRLGQHS CLR CKACFCDDHTRS KVFKQE KG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 


6130 


3


577 


GRGGTMRE YKWVLG SG \ GV GKS ALTV \QF VTCTF I EKYDPTIE 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








DFYRKE I EV \DS S PS VAG I S WTQQGTEQF \ ASMRDL Y I KKGQGC 
ILVYSLVNQQSFQ\DIKPMRDQIIRVKVSEKVPVI\LVGN\SVD 

LES ere vsssegralaeewgcp FMETSAKSKTMVDELFAE I vrq 

MNYAAQPDKDDPCCSACNIQ 


6131 


3 



1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGSPRHLPSCSPALLLLVLGGCLGVF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRASILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QEPNTFPAILRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW 
SYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSL 
DFLDYKSNFEPFFMMTATP\APHSPWTAAPQYQKAFQNVFAPRN 
KNF^IHGTNKHWLIRQABCTPMTNSSIQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYI FYTSDNGYHTGQFSLP IDKRQLY 
EFDIKVPLLVRGPGIKPNQTSKMLVANIDLGPTILDIAGYDLNK 
TQMDGMS LLP ILRGASNLTWRS DVLVEYQGEGRNVTDPTCPSLS 
PGVSQCFPDCVCEDAYNNTYAC7RTMSALWNLQYCE FDDQEVFV 
EVYNLTADPDQITNIAKTIDPBLLGKMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 


96 


1241 


AAGLLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPLFSCVESGW 
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRLAALKLEAEDIALTATSQKHKLTWLEAVNRS\CSWRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDEPPKDVFDELFKLAPEKVNA 
VKEAIVNFVNQKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEFYLTPNSPAENLHNVTLALELL/ IGRGPAQLPC/LALK/ 
TIVNKDAKSTLRVLYGLFCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVS VSQQPVSAPVPIAAHAS VAGHLSTSTTVSS SGAQNSDSTK 
KTLVTLI ANNNAGNPLVQQGGQPL I LTQNPAPGLGTMVTQPVLR 
PVQVMQNANHVTSSPVMQPIFITTQGFPVRNVRPVQNAMNQVG 
IV]^CX3GQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQSPGQSNQTT^IPKLAPSFPSPPAVSIASFVT 
VKRPGVTGENSNEVAKLVWTLNTIPSLGQSPGPVVVSNNSSAH\ 
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQBCKGKSLDSEPSVPSAAKPPSPEKTAPYAS 
/THPS STP I PALS P P Y/TKVPE PNENVGDAVQTKLI MLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPITOFPTYVHCSLCRYST 
CCSRAYANHM INNHVPRKS PKYLALFKNS VSG I KLACTS CTFVT 
SVGDAMAKHLVFKPSHRSSS ILPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEX3KYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLF I DFVQRQI HNQDLPLSM I VAIDE I SLFL 
DTEVLSSDDRKENALQTVGTGEPWCDWLAIIiADGTVLPTLVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
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SEQ 
ID 
NO: 


Predicted 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 

(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLWLGEV 

LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPBSLHQLPEGESETES 
FYGFEEADLDLMEI 


6134 


2 



4256 



FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 

TTVSVSQQPVSAPVPIAAriASVAGHLSTSTTVSSSGAQNSDSTK 

KTLVTLI ANNNAGNPLVQQGGQPLI LTQNPAPGLGTMVTQPVLR 

PVQ VMQNANHVTSS PVASQP I F I TTQG FPVRNVRPVQNAMNQVG 

IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 

STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP

TATQPTSLGQIAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFOT 

VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPWVSNNSSAH\ 

GSQRTSGPESSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEAL 

RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 

/THPSSTP I PAIiS PPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 

GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 

LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 

KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 

VHFPJVIIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 

SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 

I QKRAVRKMS VMGRQTCLE CS FE I PDFPNHFPTYVHCSLCRYST 

CCSRAYANHM I NNHVPRKS P KYLALFKNSVSG I KLACTS CTFVT 

SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 

NVKNMYP P PS F P TNKAATVKS AGATPAE P EE LLTP LAPAL PS PA 

STATPPPTPTHPQALALPPLATEGAE CLNVDDQDEGS PVTQEPE 

LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 

RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 

VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA

VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 

DTEVLS S DDRKENALQTVGTGE P WCD WLAI LADG TVLPTLVFY 

RGQMDQPANMPDS ILLEAKESGYSDDE IMELWSTRVWQKHTACQ 

RSKGMLVMDCTRTHLSEEVLAMLSASSTLPAVVPAGCSSKIQPL 

DVCIKRWKNFLHKKWICEQAREMADTACDSDVLLi3LVLVWLGEV 

LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLME I 


6135 


2 



4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 

TTVS VSQQPVS AP VP IAAHASVAGHLSTSTTVS SSGAQNSDSTK 

KTLVTLI ANNNAGNPLVQQGGQ PIj I LTQNPAPGLGTMVTQPVLR 

PVQVMQNANHVTSS PVASQP I F I TTQG FPVRNVRPVQNAMNQVG 

IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 

STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 

TATQPTSLGQLAVQSPGQSNQTTNPKLAPS FPS PPAVS IASFVT 

VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 

GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 

RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 

/THPS ST P I PALS PP Y / TKVPEPNENVGDAVQTKLIMLVDDF YY 

GRDGGKVAQLTNF PKVATS FRCPHCTKRLKNN I RFMNHMKHHVE 

LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 

KICEWAFESEPLFLQHMKDTKKPGEMPYVCQVCQYRSSLYSEVD 

VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 

CNKCRVQFLFAKDKIEHKXjQHHKTFRKPKQLEGLKPGTKVTIRA 

SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 

IQKRAVRKMS VMGRQTCLECS FE I PDFPNHFPTYVHCSLCRYST 

CCSRAYANHMINNHVPRKSPKYLALFKNSVSGI KLACTS CTFVT 
SEQ
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








S VGDAMAKHLVFNPS HRSS S I L PRGLT W I AHS RHGQTRDRVHDR 
NVKNMYP P PSFPTNKAATVKSAGATPAE PEELLTPIiAPALPS PA 
STATP PPTPTHPQALALPPLATEGAECLNVDDQDEGS PVTQE PE 
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLS FEAEEKLAEWVIiTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFIi 
DTE VLS S DDRKENALQTVGTGE PWCDWIiAIIiADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYS DDE IMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLS AS STLPAWPAGCS S KIQPL 
DVCIKRTVKNFI^KKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI


6136 


1704 


539 


FGVRMALEGMS KRKRKRS VQEGENPDDGVRGS P PEDYRLGQYAS 
SLFRGEHHS RGGTGRLASLFSSLE PQIQPVYVPVPK\ ESALASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKIIiDDTEDTWSQRKKIQ 
INQEEERLKNERTVFVGNLPVTCN KKKLKS FFKE YGQ IES VRFR 
SLIPAEGTLSKKLAAIKRKIHPDQKNINAYWFKEESAATQALK 
RNGAQ I ADG FR I RVDLAS ETS S RDKRS VF VGNLP YKVEES AI EK 
HFLDCG S IMAVR I VRDKMTG IGKG FG YVLFENTD S VHLAL KLNN 
SELMGRKLRVMRSVNKEKFKQQNSNPRLKNVSKPKQGLNFTSKT 
AEGHPKSLFIGEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNML I VAMCIiA\LIjGLPGKAQELQGHVS \ 1 1 LAGEQLGDLAKK 
YLWQG \ LFQL YLDEAGRGHS FS FHGAALTAP KQGQELMAKALES 
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ 
LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 
LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYIjGKQAVAQIIi 
PFRDQNRKALDGLWNRHHVERVEI IMKETVDAEGRTSFYEEYGV 
IRDVLQNHLTEVLTLVAMELPHNVSSAEAVLRHKLQVFQALRGL 
QRGSAWGQYQSYSEQVRRELQKPDSFHSLTPTFAGVLVHIDNL 
RWEGVP F I LMSGKALDERVGYARILFKNQACCVQSEKHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVXjVSRNLFRPSLPSSWKEMEGPPG 
LRL FG S PIiSD YYAYS P VRERDAHSVLL SHI FHGRKNFF ITTENL 
I^WNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSIiVSAWSEELISKL 
ANDIEATAVRAVRRFGQFHLALSGGS S PVALFQQLATAHYG FPW 
AHTHLWLVDERCVPLSDPESNFQGLQAHLLQHVRIPYYNIH\AM 
PVHLQQRLCAEEDQGAHI YARE ISALGANSSFDLVLLGMGADGH 
TASLFPQSPTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 
AVL VMGRMKREI TTLVSRVGHE P KKW P I SGVLPHS GQLVW YMD Y 
DAFLG 


6138 


4587 


934 


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
TDTSHLLSAVKGQERFStiYQTRSLIHELKNKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 

VDQTVFTMnon?Tf Y T K FT .V ^ P T ■HVT . K" Z\H <3 FT) PT .PF T.WF T">T . WKTP TT 

ELI KELEQSLASWTQNLKELQTMKADLTRHVLVEDVMVLKEQI E 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTAD I S IEEM I EKLQKDCMEE INLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNMSNLRTWLARIESELSKPWYDVCDDQEIQKRLAEQQD 
LQRDIEQHSAGVESVFNICDVLLHDSDACANETECDSIQQTTRS 
LDRRWRN I CAM S MERRMK I EETWRL WQKFLDDYS RFE DWLKS AE 
RTAACPNS S E VL YTS AKEE LKR FEAFQRQ I HERLTQLEL INKQ Y 
RRLARENRTDTASRIiKQr^VHEGNQRWDNLQRRVTAVLRRLRHFT 
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SEQ 
ID 
NO: 


Predicted 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide

location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NQREEFEGTRESILVWLTEMDLQLTNVEHFSESDADDKMRQLNG 
FQQEITLNTNK1DQLIVFGEQLIQKSEP\LDAVLIEDELEBLHR 
YCQEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMEDPREIQT 
DS WRKRGES EE P S S PQS LCHLVAPGHERSGCETPVSVDS \ IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAGI TEQQSGAFDRWEMIQAQEL \HNK 
LKIKQNLQQLNSDISAITTWLKKTEIAELEMLKMAKPPSDIOEIE 
LRVKRLQE I LKAFDTYKALWS VNVSS KE FLQTES PESTELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNLLLWLA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQEISNSLLIKGHGEDCIEAEEKVHVI\EKKLKQLREQVSQDLM 
ALQGTQNPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSFLSRWRAALPLQLLLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 






J 171 


LGDWWSRTCX5VLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLS C IRWYRRESMFGFFKGMS FPLAS I AVYNS WFGVFSN 
TQRFLS QHRCGEPEAS PPRTLSDLLLASMVAG WSVGLGGPVDL 
IKIRLQMQTPPVSGRQPRFEVQGSGSCX3\EPAYQGPVHCITTIV 
KWJiGLlAGXIYRGASA^IIlXlRDVPGYCIJYFI PYVFLSEW ITPEACTG 
PS PCAWLAGGMAGAI S WGTATPMDVVKSRLQADGVYLNKYKGV 

LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


694 


136 


R P ELELWRLRS RS WRPliG VPRR CHRRNWKE P VRAQ P LS VTVWAP 
RCQRP/ QPPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGSRIiGPETFRQRFRQFRYQDAAGPREA 
. FRQLREL/SPRQWLRPDI \RTKBQ\ I VEMLVQEQLLAILPEAAR 

AKKJ.KKKIDVRITG 


6141 


2 


984 


AQVGPRSRP CKMPLKLRGKKKAKS KE TAGLVEGE PTGAGGG S LS 
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQELYAQIAGAFEIS 
P S E I LYCTLNTPKIDMERLLGGQLGLEDFI FAHVKGIE KE VNVY 
J^£U±31A3xj1 X I UNbVCjYAr IKKIKDGGVIDSVKTICVGDHIESI 
NGENIVGWRHYDVAKKLKELKKEBLFTMKLIEPKKAFE I ELRSK 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLBLYMG IRDI DLATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFDVWGVIGDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETAR IGPGVMES KEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FRVRQPIL<3YRWDIMHRLGEPQARMREENMERIGEEVRQLMEKL 
REKQLSHSLRAVSTDPPHHDHHDEFC\ LMP 


6143 


2802 



270 


FRMRIFLHCPWNQQMWKIWNLLETSIiESCKAHLSIQKLLKER\Q 

\QLPVFKHRDS IVETLKRHRWWAGET\GSGKSTQVPHFLLED 

LLLNEWEASKCNIVCTQPRRISAVSIiANRVCDELGCENGPGGRN 

SLCGYQIRMESRACESTRLLYCTTCVLLRKLQEDGLLSNVS/HM 

FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 

KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
0 KFLE EEE EVT I NVTS KAGG I KK YHP Y T P VOTGAHAIYT ,TJPT? vnif 

YSSRTQHAlLYMNPHKINLDLILELIiAYLDKSPQFRNIEGAVLI 
FLPGLAHIQQLYDLLSNDRRFYSERYKVIALHSILSTQDQAAAF 
TLPP PGVRKI VLATNI AETG I T I PDWFVI DTGRTKENKYHES S 
QMSSLVETFVSKASALQRQGRAG^VRDGFCFRMYTRERFEGFMD 
YS VPE I LRVPLEELCLHIMKCNLGS PEDFLS KALDPPQLQVI SN 
AMNLLRKIGACELNE PKLTPLGQHLAALP VNVK IGKML I FGAI F 
GC1JDPVATLAAVMTEKSPFTTPXGRKDEAD1AKSALAMADSDHL i 
T I YNAYLGWKKARQEGG YRS E I TYCRRNFLNRTSLLTLED VKQE 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








L I KLVKAAGFSSSTTSTSWBGNRASQTLS FQEIALLKAVLVAGL 
YDNVGKIIYTKSVDVTEKIACIVETAQGKAQVHPSSVNRDLQTH I 
GWLLYQEKI RYARVYLRETTL I TPFP VLLFGGDIEVQHRERLLS 
IDGW I YFQAPVKIAVI FKQLRVLI DSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN 
VSASGGARHGGRGSGGPVICT YGPDLFPLVA\ TTGAAFVAKVMS 
VGDRTVTIiGIWDTAGSERYEAMSRIYYRGAKAAIVCYDLTDSSS 
FERAKFWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV
QDYADNI KAQLFETSS KTGQS VDELFQKVAEDYVSVAAFQVMTE 
DKGVDLGQKPNPYFYS CCHH 


6145 


1109 


196 


GGMDLS ELERDNTGRCRLSS P VPAVCRKB PCVLG VDEAGRGPVL 
GPMVYAI CYCPLPRLADLEALKYADS KTLLES ERERLFAKMEDT 
DFVGWALDVLS PNL I S TS MIX3RVKYNLNSLSHDTATGL I Q YALD 
QGVNVTQVFVDTVGMPET YQARLQQS FPG IEVTVKAKADALYPV 
\VSAASICAKVARDQAVKKWQFVEKLQDLDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


IiKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS 
R / YPGHE AHDQGG\ WDARQ3 1 IRKWDPETGRTRLI KGDGEVLE 
E I VTKERHRE I NKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKF ISATDTI RKMKNDFRKME 
DEMDRLATNMAVITDFSARISATLQDRHERITKLAGVHALLRKL 
QFLFELPSRLTKCVELGAYGQAVRYQGRAQAVLQQYQHLPSFRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
EELCEEFIAHARGR1^KELRNIjEAELGPSPPAPDVL.EFTDHG\S 
SGFVGGLCQVAAAYQELFAAQGPAGAEKLAAFARQLGSRYFALV 
ERRLAQEQGGGDNSLLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLQGLRAAFLGCLTDVRQALAAPRVAGKEGP 
GXJ^LLANVASSILSHIKASIjAAVHLFTAKEVSFSNKPYFRGEF 
CSQGVREGLI VGFVHSMCQTAQSFCDS PGEKGGATP PALLLLLS 
RLCI^YETATISYILTLTDEQFLVQDQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRLAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCIWASHGASSVARASVREPQGNKSPRMNTKRAGECIiCPRS 
CSFS AQDYDI FAP ILPVEKQRLRVTQEVRAGI^VLVLKI RPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


3056 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVAIjTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSQMDQKPTKTPGRT 
SSTLSEDQMSRLAKLTKAHRQGHMVKVDWIiDRLTFREIEMINES 
VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNQDKALTKILTSVIW 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVT^R 
LRQADDEDLLMYLLQLVQALKYENFDD I KNGLEPTKKDSQSS VS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFLISRASKNSTLAWYLYWYVIVECEDQDTQQRDPK 
THEMYLNVMRRFSQALLKGDKSVRVMRSLIiAAQQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELIPLPLEPQVK 
IRGIIPETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDLRQD 
QL I LQ 1 1 SLMDKLLRKENLDLKLTP YKVLATSTKHGFMQF I QS V 
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SEQ 
ID 
NO: 


Predicted 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence


Amino acid segment containing signal peptide 

(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








P VAEVLDTEGS IQNFFRKYAPSENGPNG I S AEVMDT YVKS CAG Y 
CVITYILGVGDRHLDNLLLTKTGKLFHIDFGYILGRDPKPLPPP 
MKLNKEM VEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNL I LNLF 
SLMVDANI PDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSIiID 
ESVHALFAAWEQIHKFAQYWRK **


6149 


1 



1413 


RVDPRVRENGTANP I KNGKTS PAS KDQRTGKKTS VQGQVQKGND 
ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 
^HKPJbljMDSEDEBEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS ! 
GPTQDLNTILLTSAQLSSDVAVETPKQEFDVFGAVPFFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
vkji^r^\rii afiji VUvr(joJLfc'J?yi'r LiTSTSKSESNEDLiFGIjVP 

FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
STKKTLKPTYRTPERARRHKKVGRRDSQS SNEFLTISDS KENI s 
VALTDGKDRGNVLQPEESLLDPFGAKPFHSPD\LSWHPP\HQGL 
S \D IRADHNT\ VLPGR\ PRQNSLHGS FHSADVLKMDD FGAVP / F 
LTELWQS ITPHQSQQSQPV\ELDPFGAAPFPSKQ


6150 


372 


37 


MSNI KKY I IDYDWKAS IE I EIDHDVMTEEKLHQINNFWSDSE YR 
LNKHGS VLNAVL IMLAQHALL I AI SSDLNAYGWCE FDWNDGNG 


6151 


1555 


521 


DSNQQS VSGTAASTLfcHS FKATI Y YQGTGHVQQF YGVTS P YSQT
TPPIVQSYAQPSLQYIQGQQIFTAHPQGVWQPAAAVTTIVAPG 
QPQPLQPSEMVVTNNLLDLPPPSPPKPKTIVLPPNWKTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN
PMKXASKKPKTAEADTSSEIAKKSKEVFRKEMSQFIVQCLNPYR 
JxfUL-Jv vtj \K1 111 tLUb xUiliAKKij 1 HG VMNKBJjKYCKNPE \DLEC 1 
NENVKHKTKEYIKKYMQKFGAVYKPKEDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGBFNPADVN


6152 


1366



648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCX3AVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTETGCRCDAGOTGSNCSEECPLGWHGPGCQRPC
KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGELSFFTRTAW 
IiALTIiAtAFLLLISTAANLS LLLSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD


6153 


2


3368 


GRVGARS PGRAYALLLLL I CFNVGSGLHLQVLSTRNENKLLPKH 
PHLVRQKRAWITAPVALLEGEDLSKKNPrAKIHSDIiAEERGLKI \ 
TYKYTGKGITEPPFGtFVFNKDTGELNVTSILDREETPFFLLTG 
YALDARGNNVEKPLELR I KVLD INDNE PVFTQDVFVGSVEELS A 

AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT ! 
GEIYTTSVTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 
I LDVNDNI P WENKVLEGMVEENQVNVE VTRI KVFDADE IGSDN 
WLANFT FASGNEGG YFH IETDAQTNEG I VTLI KEVD YEEMKNLD 

fsvtvankaafhks irskykptp i pi kvkvknvkegihfkssvi 
s i yvs e s mdrss kgq 1 1 gnfqafdedtgl pahar yvkledrdnw 
isvdsvtseiklaklpdfesryvqngtytvkivaisedyprkti 
tgtvlinvedindncptliepvqtichdaeyvnvtaedldghpn 
sgpfsfsvidkppgmaekwkiarqestsvllqqsekklgrseiq 
flisdnqgfscpekqvltltvcevlhgsXgcreaqhdsyvglgp 

AAIALMI LAFLLLLLVPLLLLMCHCGKGAKGFTP I PGTIEMLHP 
WNNEGAPPEDKVVPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 
S AS I VKGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI \MTTE 
TTITARATGASRDVAGAQAAAVALNEEFLKNYFTDKAASYTEED 
ENHTAKDCLLVYSQEETESLNASIGCCSFIEGELDDRFLDDLGL 
KFKTLAEVCLGQKIDINKEIEQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSQPQSLIV 
TERVYAPASTLVDQPYANEGTWYTERVIQPHGGGSNPLEGTQH 
LQDVP YVMVRERES FIiAPSSGVQPTLAMPN IAVGQNVTVTERVL 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLEESGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


2146


KKKTKMKNTLQKTVNFGAWPKPTISDKSHLLQMVSKLDLTDAKN 
SDTAHI KS IEITS ILNGLQASESSAEDSEQEDERGAQDMDNNGK 

nn r> t^ttt rrrvrkmxTT'vT r\ Y^T^T^ r% T T T'M <^TT./T TT T Tv TNT tTT^ T^T\f TCT/" 

EE S KI DHLTNNRNDL I S KEEQN S S S LLEENKVHADLV I S KP V S K 
S PERLRKD IEVLSEDTD YEE DEVTKKRKDVKKDT TDKS S KPQI K 
RGKRRYQJTEECIiKTGSPGKKEEKAKNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSEVAEKRIKLL 
NNSDERLQNSRAKDRKDVWS S IQGQWPKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 

RQQSSVTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGELQDIiQS 
ERE * IiASRF * CQCELEQ * + SARTRTS * KSLYRSEKSERCSGRRK 
FI KKAEKKP * SNSGKQQKEGK 


6155 


869 


121 


HLLPELRGKSWITMKYVFYLGVLAGTFFFADSSVQKEDPAPYLV 
YLKSHFNPCVGVLI KPSWVIoAPAHCYLPlSrLKVMLGNFKSRVRDG 
TEQTINPIQIVRYWNYSHSAPQDDLMLIKLAKPAMLNPKVQALN 
P\ PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNIjEAPVMSDRE 
CQKTEQGKSHRNSLCVKFVKVFSRI FGEVAVATVICKDKLQGIE 
VGHFMGGDVGIYTNVYKYVSWIENTAKDK


6156 


5725 


3984 


GTSTVTMATKKHFS 1 1 LNIiLGMLLKKDNQDTRKLLMTWALEVAV 
VMKKSETYAPLFCLPS FHKFCKGLIiADTLVEDVN I CLQACSSLH 
ALS S S LPDDLLQRCVDVCRVQL VHRGTCI RQAFG KLL KS I PLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDNWLERLFYSCQRLDKRDQSriPRNLLKTDAVLWQW 
AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGI I RS LAGHTLNPDQ 
DVSQWTTADNDEGHGNNQLRLVLLLQYLENLBKLMYNAYEGCAN 
ALTSPPKVIRTFLYTNRQTCQDWLTRIRLSIMRVGtiLAGQPAVT 
VRHG FDLLTEMKTTS LSQGNE LEVS IMMWEALCE LHCP EAI QG 
I AVWS S S I VGKHLLWINS VAQQAEGRFE KASVE YQEHLCAMTGV 
DCCI S SFDKSVLTIiASAGCKSASLKHCLNGESRKS VLSKPTDSS 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKADFNYI KSLSSFESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIIIiQCAEDIEH 
PPPGRAHFQKWLMDGTVLCKLINSIjYPPGQEPIPKISESKMAFK 
QMEQ I S QFLKAAET YGVRTTD I FQTVDLWEGKDMAAVQRTLMAIi 
t»bVAV L AJJUuL. iKbiiroWrnKWiyuiNKKvjr oilisu^^U^^W-" v - L » jlJ 
QMGSNKGASQAGMTGYGMPRQ I M* DAASCP 


6158 


441 


1482 


LGSLIVLSLHCKVIFSSQSLERAMKEKAVDLVPIIiAQNPGLAQN 
P ILEGKDHNQNTGVDPI I DHVQDRKTD/SRSKS PHKKRS KSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDREKDKEKDREREREKEHEKDRDKEKEKE 

RRRRRSRSSSRS PRTSKTI KRKSSRSPSPRSRNKKDKKREKERD 
HISERRERER5TSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SSVSKEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159 


53


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
IITGTPILTFVKDPQLEVNFYTGMDEDSDIAFQFRLHFGHPAIM 
NSCVFGIWRYEEKCYYLPFEDGKPFELCIYVRHKEYKVMVNGQR 
I YNFAHRFP PAS VKMLQ VFRD I S LTRVL I S D * GRC VR I TAVQE F 
DVS VSCDCTTAYQ PG 


6160 


1626 


1790 


AGAKFFP * F * KVADAQPTE SEKE I YNQVNWLKDAEG I LEDLQS 
YRGAGHEIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 
LEAALRGLLGALTS TPYS PTQHLEREQALAKQFAE I LHFTLRFD 
E LKMTNPA IQND FS Y YRRTLS RMR INNVPAEGENE VNNE LANRM 
SLFYAEATPMLKTLSDATTKFVSENKNLPIENTTDCLSTMASVC 
RVMLETPEYRSRFTNEETVSFCLRVMVGVI I L YDHVHP VGAFAK 

TSKIDMKGCIKVLKDQPPNSVEGLLNALRYTTKHLNDETTSKQI 
KSMLQ*QLLTLVNKG 


6161 


455 


1569 


PVSGSESSLRRAWASILRLMLGPRVAVSILCEDGISH*LLEKH* 
KSHVLEPLSSLALEEQCIiALSLDWSTGKTGRAGDQPLKIISSDS 
TGQLHLLMVNETRP RLQ KVAS WQAHQFEAW IAAFNYWHPE I VYS 

GGDDGLLRGWDTRVPGKFLFTSKRHTMGVCSIQSSPHREHILAT 
G S YDEHI LLWDTRNMKQ PLADTP VQGG VWR I KWHPFHHHLLLAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSIi 
LATCS FYDHALHLWE WEGN 


6162 


1 


586 


RT I HATGRAG AS PMHRL I VW RLAE AN KQH VRC QKCLE FG H WTYE 
CTGKRKYLHRPSRTAELKKALKEKENRLLLQQSIGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSKSEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE IELLHS YWTDGLKTLM 


6163 


1081 


785 


R I RSTTEGCAVRLHPTQNTGKARIM I LLSVSLGRHWAFTYKFFL 
TPVVFVFFFFFFHRKE*VMQKNPMKSREDEWMEKLNNLHVQRAD 
MNRL I MNYLVTEG FKEAAEKFRMESG I EPSVDLETLDER I KIRE 
MILKGQIQEAIALINSLHPELIjDTNRYLYFHLQQQHLIELIRQR 
ETEAALEFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVWSEWQAVLDYENRESTPKLAKLLKLLLWAQN 
ELDQKKVKYPKMTDLSKGVIEEPK 


6164 


90 


406 


PCQS PGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTI LSNVLKKR 

SCISRTAPRLLCTLEPGVDTKIjKFTLEPSLGQKGFQQWYDALKA 

VARLS TG I PKE WRRKVWLTLADHYLHS I AIDWDKTMRFT FWERS 

NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRVVLKRVLLAYAR 

WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVLPES 

YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 

G YEPPLTNVFTMQWFLTLFATCIiPNQTVLBaWDSVFFEGSEI IL 

RVSLAIWAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHE 

LMQTVYSMAPFPFPQIaAELREKYTYNI TPFPATVKPTS VSGRHS 

KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPBLQKYQKQIKE 

PNEEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 

ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 

HLLLGKKMKMTNRAAKNAVI H I PGHTGGKI S PVP YEDLKTKLNS 

P WRTHI RVHKKNM PRTKSHPGCGDTVGL I DE QNEAS KTNGLGAA 

EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 

VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 

S KAPQGSNS KTPI FS P FPSVKPLRKSATARNLGLYGPTERTPTV 

HFPQMS RS FS KPGGGNSGP * KMVFSSGTMLS RQLPGYPQEYQRN 

GGERFG 


6165 


90 


405 


PCQSPGRSRMRQDKLTGSURRGGRCLKRQGGGVGTIIiSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGI PKE WRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NP DDDSMG I QI VKDLHRTGCSS YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFWILAALILEVMEGNEGDALKIMIYLIDKVIiPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVS LAI WAKLG EQ I E CCETAD E F YSTMGRLTQE MLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDS DE ENDPDDEDAWNAVGCLG P FSG F LAPE LQK YQ KQ I KE 
PNEEQSIiRSNN IAELS PGAINSCRSE YHAAFNSMMMERMTTDIN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEP PSAPEENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP * KMVFSSG1WLSRQLPGYPQE YQRN 
GGERFG 


6166 


2 


1206 


HKLWRT VAMAGAEWKS LE E CLEKHLP LPDLQEVKRVLYGKELRK 
LDLPREAFEAASREDFELQGYAFBAAEEQLRRPRIVHVGLVQNR 
IPLPANAPVAEQVSALHRRI KAI VE VAAMCGVN 1 1 CFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTRFCQKLAKNHDMVWSPIIiE 
RDS EHGDVLWNTAVVI SNSiSAVLGKTRKNHI PRVGDFNESTYYM 
EGNLGHPVFQTQ FGRIAVNI CYGRHHPLNWLMYS INGAE 1 1 FNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGLSRSRDGLLVAKLDL 
NLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYSPTIVKE*PAS 
VPALG 


6167 


1220 


1844 


YG I VTGPSLCAGDKQPKKQE KNPVLVS PEFVDEALCACEE YLSN 
LAHMDIDKDLEAPLYLTPEGWSLFLQRYYQWHEGAELRHIjDTQ 
VQRCEDI LQQLQAWPQ I DMEGDRN I WI VKPGAKSRGRG I MCMD 
HLEEMbKLVNGNPWMKDGKWWQKYIERPLLIFGTKFDLRQWF 
LVTD WNPLTVWF YRDS Y I RFSTQPFSLKNLDK* APLYLTP EGWS 
LFLQRYYQWHEGAELRHLDTQVQRCEDILQQLQAWPQIDMEG 
DRNIWIVKPGAKSRGRGIMCMDHLBEMLKLVNGNPVVMKDGKWV 
VQKYIERPLLIFGTKFDLRQWFLVTDWNPLTVWFYRDSYIRFST 
QPFSLKNLDK 


6168 


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 
ELNELFKPVVAAQKISKGADPKSWCAFFKQGQCTKGDKCKFSH 
DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEVVNKKHG 
E AEKKKPKTQ I VCKHFLEAI ENNKYGWF WVCPGGGD I CMYRHAL 
" P PG F VLKKKKKKKKKEDE I S L * DL I ERERS ALGPNVTKI TLES F 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDEVDDSVSVNDIDLSLYIPRDVD 
ETGITVASLERFSTYTSDKDENKLSEASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPE DLNLPNAV I TR 1 1 KEALPDG VN I S KEARSAI S R 
AASVFVLYATS CANNFAMKGKRKTLNASDVLSAMEEME FQRFVT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSEEQDKSRDEDN 
DEDEERLEEEEQNEEEEVDN*KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIALKAAKDGANIVIA 
AKTAQPHPKLLGTIYTAAEEIEAVGGKALPCIVDVRDEQQISAA 
VEKAIKKFGGIDILVNNASAISLTNTLDTPTKRLDLMMNVNTRG 
TYLASKACI PYLKKS KVAHI PNI SPPLNLNPVWFKQHCGRW* W 
G * GDGLCLI CFELNLCMSDVITI CT 


6171 


382 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQS IQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGD I FGDS FAAYFPRVLKQVHQALSLSQEAVSVMDSMVRD I LD 
R IATEAGHIiAHYSKCVTITSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
S AQERKERLRRALE ENRL I P TELRREALALQGSLE FDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKIiVFPGA 
QRMNRGRHEVGALVRAC3CANGVTDLLVVHEHRGTPVGLIVSHLP 
FGPTAYFTLCNVVMRHDI PDLGTMSEAKPHL I THGFS SRLGKRV 
SDILRYLFPVPKDDSHRVITFANQDDYISPRHHVYKKTDHRNVE 
LTEVGPRFELKLYM I RLGTLEQEATADVEWRWHP YTNTARKRVF 
LSTE *AAPRP LGOT.T. 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQRIHTGEKPYVCSVCX3KAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
RIHTGERPYVCPLCGKAFNHSTVLRSHQRVHTGEKPHRCNECGK 

GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHL I QHQKVHRKL* PTCVLS VGSALAGVPTS FS IS VSTLERS P 
MCAVYVGRPSARAQSIjVNTGQFTQVRSPMSVMSVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLCKJIiARRIjGVAESWT 
RDRHCAADLAIiAPLGDAQLVLLRPRRLMNANGRSVARAAELFGL 

taeevylvhdeldkplgrlalklggsarghngvrscisclnsna 
mprlrvgigrpahpeavqahvlgcfspmqellpllldratdli 
ldhirersqgpslgp+h*wfskka 


6175 


2204 


334 


ryfradprsrsgqpraegdgafaegplramaapvkgnrkqsteg
daldppaspkpagkqngiqnpisledspeaggereeeqereeeq
aflvslykfmkerhtpiervphlgfkqinlwkiykaveklgaye
lvtgrrlwknvynelggs pgstsgatctrrhy* RLVLPYVRHLK ; 
geddkplptskprkqykmakenrgddgaterpkkakeerrmdqm 
mpgktkadaadpaplpsqepprnsteqqglasgssvsfvgasgc 

■fiLi-k.1 ivx.ijij^oi' i Urvvj 1 HvslMolHJjAitiUiU-LLjAQVSK v EAIjQCQEEG 

CRHGAEPQASPAVHLPESPQSPKGLTENSRHRIjTPQEGLQAPGG 

slreeaqagpcpaapi fkgcfythptevlkpvsqhprdffsrlk 
dgvllgppgkeglsvkepqlvwggdanrpsafhkggsrkgilyp 
kpkacwvspmaktpaesptlpptfpsspglgskrsleeegaahs 

YRGTMIjHCPLNFTGTPGPLKGQAALPFSPLVIPAFPAHFLATAG 
PS PMAAGLMH FPPTS FDS ALRHRLCPAS SAWHAP P VTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDS FGRCQLAG YGFCHVPSS PGTHQLACPTWRPLGS WREQLAR 
AFVGGGPQLLHGDTI YSGADRYRIjHTAAGGTVHLE IGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHC YLDO I KRS DFLGFSG YS PHFVAI STNSRHKMO P SSMOf) AT . 

PSQ*PYWTDPRPAIiVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAALAHGCIi 
HCHSNFS KKFS FYRHHVNFKSW WVGD I PVSGALI/row QDnTMTf V 

LH^IPAKITREfOiDQVATAVYQM^QLYQGKMYFPGYFPNELR 
NIFREQVHLIQNAIIESRIDCQimaSIFQYETISCraCTDSHVA 
CFGYNCESSAQWKSAVt^LLOTINNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCLEPPHLANLSLEDAA*CLKQH 


6179 


806 


276 


RGETREMAGNIJ^SGAGRRLWDWVPIiACRSFSIjGVPRLIGIRLTL 
P P P KVVDRWNEKRAM FGVYDN I G I LGNFEKHPKEL IRGP I WLRG 
WKGNE LQRC IRKRKMVGSRMFADDLHNLNKRI RYLYKHFNRHGK 
FR*rCRKLRTSEKAHLSPWRRETVLFPVRKRLCIFSVIKWGFFGI 


6180 


156 


1833 


DHH ILKAASTTHVCARGN I FAI PNTRCLEC * ATATPS S LECQN* 
SHLSLCPLPATTSGLTPNSMIPEKERQNIAERLLRVMCADLGAL 
SYVSG KE FLKLAQT L.VDSGAR YGAFS VTE ILGNFOTLA1»KHLPR 
MYNQVKVKVTCAMSNACXGIGVTCHSQSVGPDSCYILTAV'QAE 
GNHIKSYVIiGVKGADIRDSGDLVHHWVQNVLSEFVNSEIRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 
VIEIjLNVCEDLAGSTGLAKBTFGSLEETSPPPCWNSVTDSLLLV 

heryeq i cefysrakkml^ iqslnkhllsnlaailtpvkqavi e 
lsnesqpti^lvlptyvrleklftakandagtvsklchlfiieal 
kenfkvhpahkvamildpqoklrpvppyqheeiigkvcelinev 
keswaeeadfepaakkprsaavenpaaqeddrlgknevydylge 
plfqatpdlfqywscvtqkhtklaklafwllavpavgarsgcvw 
mceqall i krrrlls pedmnklmflksnml 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI 
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLHIYVGIESNHLLPRFLQLTERIIIliFWITSQEEVQE 
KYWCVLFVFWNLLDMVRYTYSMLSVIGISYAVLTWLSQTLWMP 
IYPLCVLAEAFAIYQSLPYFESFGTYSTKLPFDLSIYFPYVLKI 
YLMMLFIGM YFTYSHLYSERRDI LGI FP I KKKKM* STAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS * IDYQLNTLLKEFQLTEENTKLRYLTCSLIEDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERIATQKILSVLGECLDHFGPGCVGVQKILNARCPLVR 
FSHQASGFQCDLTTONRIALTSSELIiYIYGALDSRVRALVFSVR 
CWARAHSLTS S I PGAW I TNFSLTMMVI FFLQRRSPP ILPTLDSL 
KTIiADAEDKCVIEGNNCTFVRDLSRIKPSQNTETLELLLKEFFE 
YFGNFAFDKNS INIRQGREQNKPDSSPLYIQNPFETSLNI SKNV 
SQSQLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRKS FTKKKSNKFAI ETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 

GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 

RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 

GSCGCSQSSCCKPCCCSSGCX3SSCCQSSCCKPCCCQSSCCVPVC 

CX3SSCCKPCCCQSNCCTPVCC^CKI*GSGPRPSGFSCLVKAFL^ . 

VP 


6184 


1 


2191 


I VTVREEDGAPAVAP PGWVS RANKRSGAG PGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRI ITS 
ELYRSLGDVLRDVDAKALVRSDFLLVYGDVISNINITRALEEHR 
LRRKL*K3WSVMTMIFKESSPSHPTRCHEDNVVVAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGS SDGVEVRYDLLDCHI SICS PQVA 
' QLFTDNFDYQTRDDFVRGLLVNEE ILGNQ IHMHVTAKEYGARVS 
NLHMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
P E VS LGHGS I LEENVLLGS GT VIGSNCF I TNS VI G PGCH I EPGD 
NWLDQTYLWQGVRVAAGAQIHQSLLCDNAEVKERVTLKPRSVL 
TSQVVVGPNITLPEGSVISLHPPDAEEDEDDGEFSDDSGADQEK 
DPCVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKI 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
GKEEN I S CDNLVLE INSLKYAYNISLKEVMQVLSHWLE FPLQQ 
MDS PLDS SR YCALLLPLLKAWS PVFRN YI KRAADHLEALAAI ED 
FFLEHEALGISMAKVLMAFYQLEILAEETILSWFSQRDTTDKGQ 
QLRKNQQLQRFIQWLKEAEEESSEDD 


6185 


791 


44 


PCTSCVLWATLHIiPASTRKAPQAECGMISITEWQKIGVGITGFG 
I FFILFGTLLYFDSVLLAFGNLLiFIjTGLSLI IGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI pflgalfrrlqgtssmv* KTEMSSLNLDHWLKGAK 
REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 
GCQEAEMQTPRRLGWGMYHTLTLYLWEEK 


6186 


569 


238 


V YG IDS S NTNTHGAE E RNR KLKKHWKLCHAQ S RLDVNGLAL KMA 
ivcevInv jyin iv v zunjvhjji c.n v ELiN&r 1 WvcivlTJirloAXiiirl/r buovlb 
NIRNQI^TLHSQPHQEENLCFENSFSIjINLLPINAVEPTSSQQI 
PNRETSEANKERRKMTSKSSESNIYSPLTSFITADSELHDIIKD 
LEDCLIWGLHTCGDLAPNTLRI FTSNS E I KGVCSVGCCYHLLSE 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRN7\RMSACLALERV 
AAGQGLPTES LFYRAVLQD 1 1 KDC YGI TKCDRHVGKI YSKCS S F 
LDYVRRSLKKLGLDES KLPEKI IMNYYEKYKPRMNELEAFNMLK 
VVLAPCIETLILLDRLCYLKEQEDIAWSALVKLFDPVKSPRCYA 

PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA 
AAPLMERKFHVLVGVTGSVAALKLPLLVSKLLDIPGLEVAWTT 
ERAKHFYSPQDIPVTLYSDADEWEMWKSRSDPVLHIDLRRWADL 
LIjVAPLDANTLGKVASG I CDNLLTC VMRAWDRS KPLL FC PAMNT 
AMWEHPITAQQVDQLKAFGYVEIPCVAKICLVCGDEGIjGAIvIAEVG 

tivdiwkevlfqhsgfqqs*pgisvmgvplysewvqaksvkmdv 
gkiggyphllnggpalslprgqacsrlnwtegpglsffqpgeaa 

A 


6188 


238 


1534


nigvficircagihrnlgvhisrvksvnldqwtqeqiqcmqemg 
ngkanrlyeaylpetfrrpqidpavegfirdkyekkkymdrsld 
inafrkekddkwkrgsepvpekinjep\a^ekvkmpqkkedpqlp 

RKS S PKSTAPVMDLLGLDAPVACS I ANS KTSNTLEKDLDLEiASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QLSKX>SILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNSIMGSMMPPPVGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTQQAGYT^GMAAMPQTV^GVQPAQQLQWl^TQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLS PQMWK 


6189 

W J» W «r 


1297 


TOT 


T/"ST?DT.r3F1T.r , TI*T.T D/in^/rvAT ^\W/^T?\7TJT50 r TV^ Ti Q 7V A O t T7v /"* T? T T/^T 
lAjCi Ir IAjJJIj UAJj i fou V s^v Vrilr o iAjAy uo Att\Jo vA\jCt Vyjj 

TQLSHARQRPSCQGSQLIALDWHMDISRQPRWQHVQPVARQVQ 
RAQQAQLAEGYAVHLWAGDAVVAEVELLQEVGGG KVFAANACDL 
WQDHEGAJElAARQATGHAIjQRVI VQVRR VQPLEAIi* R VPS GLPR 
RVRAFMILHNQITGIGREDFATTYFLEEIiNLSYNRITSPQVHRD j 
AFRlvliRLLRSLDLSGNRLHMLPPGLPRNVHVLKVKRNE 
GA1»AGMAQLRELYLTS1^LRSRAIX3PRAWVDLAHLQLLD I AGNQ 
LTEI PEGLPESLEYLYLQNNKISAVPANAFDSTPNLKGI FLRFN 
KLAVGS VVDSAFRRLKHLQVLDIEGNLEFGD I S KDRGRLGKEKE 
EEEEDEVEEEETR 


6190 


66 


1309 


ILVGNVSFLLSFAE YVCNCS WGS TiNVWRPNOTTGOrE PR PfJVn 
GLHCETCKEGF YLNYTSGLCQPCDCS PHGALS I PCNSSGKCQCK 
VGVIGS ICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNHCE E CKEG F YO S PDAT KE CLRCP CS AVTSTG S CS I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHDLEGNCIK 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTSALADVSWTQFNI IILTVI 1 1 VWLLMG FVGAVYM YRE 
YQNRKLNAPFWT I ELKEDN I S FSS YHDS I PNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


TOLCHGGLIxHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDWIKKIWYIYTMEYYATIICRNEIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDS EPESDGEAG I EAVGSAAEE 
KGGLVSDAYGEDD FSRLGGDEDGYEE EEDENSRQSEDDDSETE K 
PEADDPKDNTEAEKPJDPQELVASFSERVRNMSPDEI KI PPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYI IQRKKEFRNPS I YEKLI 
QFCA I DE LGTN Y P KDMFDPHG W S EDS Y YE ALAKAQK I EMD KLE K 
AKKBRTKIE FVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
Do Ai f V 1 1 1 AQPT I iiTl iAlJjPAV VI VI 1 SASGS KTT VI S AVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAG IEAVGS AAEB 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PE ADDPKDNTEAEKRDPQELVAS PSERVRNMS PDEIKI PPE PPG 
RCSNHLQDKIQKLYERKIKEGMDMNYI IQRKKEFRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEAIiAKAQKIEMDKLEK 
AKKERTKI E FVTGT KKGTTTN ATS TTTTTAS TAVAD AQ KR KS KW 

DoAlirV 1 1 lAyPl IIil 1 lAiJbJrAVV 1 V 1 lfaAMabK.1 1 VlbAVtsA 

IVKKAKQ 


6194 


3 


950 


TRGCGNKMAGKKNVL S S LAVYAED S E PES DGEAG I EAVGS AAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
FciAUJJt'rdJiN 1 £jAejX\-KU r*y £iJj V AS r iaoK vKJNMoyUr»ll\lrr JSirlrVj 
RCSNHLQDKIQKL YERKI KEGMDMNY 1 1 QRKKE FRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDS YYEALAKAQECI EMDKLEK 
aKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DS AI P VTT IAQP T ILTTTATLPAVVT VTTSASGS KTTVI S AVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKENVKD 
X iy KWnbbSJAybblUKl L AAr yyvjrs.lF r IFr faAPPPAGAMIPPP 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISNLEARNLGPRLTPLLQEEDSH 
QRLLMGLMVS ELKDHFLRHLQGVEKKKI EQMVLDYI SKLLDL I C 
HIVETNWRKHNLHSWVLHFNSRGSAAEFAVFHIMTRILEATNSL 
FLPLPPGFOTI4HTILGVQCLPLHNLLHCIDSGVLLLTETAVIRL 
MtUJJjUN raKWcKIjKFSI I VRLPPLIGQKICRLwDHPMSSNIISR 
NHVTRLLQNYKKQPRNSMINKSSFSVEFLPLNYFIEILTDIESS 
NQALYP FEGHDNVDAE FVE EAALKHTAMLLGL 


6197 


3 


819 


ADPEGTEEAVMSRYTRPPNTSLFIRNVADATRPEDLRREFGRYG 
P I VDV£ I PLDF YTRRPRG FAYVQ FEDVRDAEDAL YNLNI^WVCG 
RQIEIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALS PS FISPACFLLRKLPALEDGTLPHPDTLGMTJYEGARSE 
RENHAADDS EGGALDMCCS ERLPGLPQ P I VMEALDE AEGLQDSQ 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHI WSQNATNLVS S LLTLLKQLEPTAWLDSGTW 

SRQPSFTYSEWMEEKIEDDFLDLDPVPETPVFDCVMDIKPEADP 
TSLTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREES 
AREYLLSASRVLQAEELHEKALDP FLLQAEFFS I PMNFVDPKE Y 
DIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRG 
YGGEEKVYIATOGPI VSTVADFWRMVWOEHTPI IVMITNI EEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAGIGRTGCF I ATS I CCQQLRQEGWD ILKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQSPE 


6199 


144 


1211 


MARENGES S S S WKKQAED I KK I FE F KETLGTGAFS E WLAE EKA 
TGKLFAVKC I P KKALKGKESS I ENE IAVLRKI KHENI VALEDI Y 
ES PNH L YLVMQLVSGGE L FDR I VE KG FYTE KDAS TLIRQVLDAV 
YYLHRMG I VHRDLKPENLLYYS QDE E S K IM I S DFGLSKMEGKGD 
VMSTACGTPGYVAPEVLAQKPYSKAVDCWS IGVIAYILLCGYPP 
FYDEND S KLFEQ I LKAEYEFDS PYWDDI SDSAKDFI RNLMEKDP 

NKRYTCEQAARHPWIAGDTAIiNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAVVRHI^KIjHIXSSSIjDSSNASVSSSLSLASQKDCASGTF 
HAL* 


6200 


702 


96 


LPEVPHSLRPRVKPHLCCAQPAVRVMARLPKLAVFDLDYTLWPF 
WVDTHVDPPFHKSSDGTVRDRRGQDVRLYPEVPEVLKRLQSLGV
PGAAASRTSEIEGANQLLELFDLFRYFVHREIYPGSKITHFERL
QQKTGIPFSQMIFFDDERRNIVDVSKLGVTCIHIQNGMNLQTLS
QGLETFAKAQTGPLRSSLEESPFEA


6201 


2809 


2383 


GQTPRVRWKMRRSLRAGKRRQTAGRKSKSPPKVPIVIQDDSLPA 

GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYAEA 

RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE 

DGIFDSGNFEQFLREKVKVNGKTGNLGNWHIERFKNKITWSE 

KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDESESED


6203 


419 


2550 


RCP R PPATAGAAAS R P DRS P P S G I SGS EAAAGAGAAAPAS QHPA 
TGTGAVQTEAMKQ I LGVI DKKLRNLEKKKGKLDD YQERMNKGER 
LNQDQLDAVS KYQEVTNNLEFAKELQRSFMALSQDIQKTI KKTA '1 
RREQL^EEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 
PILSEEELSLLDEFYKLVDPERDMSLRLNEQYEHASIHLWDLLE 
GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEEEEAA 
SAPAVEDQVPEAEPE PAEE YTEQS EVESTEYVNRQFMAETQFTS 
GEKEQVDEWWETVEVVNSLOOOPOAASPSVPEPxISLTPVAOAn 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPAIVSAQPM 
NPTQNMDMPQLVCPPVHSESRLAQPNQVPVQPEATQVPLVSSTS
EGYTASQPLYQPSHATEQRPQKEPIDQIQATISLNTDQTTASSS
LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFNMNAPVP
PVNEPETLKQQNQYQAS YNQSFSSQPHQVEQTELQQEQLQTVVG 
TYHGS PDQSHQyTGNHQQPPQQNTGFPRSNQPYYNSRGVSRGGS 
RGARGli^GYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 1 

RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 1 
NTQQVN


6204 


2933 


787 


CTHNLISLLGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV
PGDIIKSWSKEMDKRYLQFDIKAFVENNPAIKWCPTPGCDRAV
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECLGEAHEP
CDCQTWKNWLQKITEMKPEELVGVSEAYEDAANCLWLLTNSKPC
ANCKSPIQKNEGCNHMQCAKCKYDFCWICLEEWKKHSFVHWEVI
YRCTRYEVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK
NHEHSYQLEQRLLKTAKEKMEQLSRALKETEGGCPDTTFIEDAV
HVLLKTRRILKCSYPYGFFLEPKSTKKEIFELMQTDLEMVTEDL
AQKVNRPYLRTPRHKIIKAACLVQQKRQEFLASVARGVAPADSP
EAPRRSFAGGTWDWEYLGFASPEEYAEFQYRRRHRQRRRGDVHS
LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
SLRDYTPASRSENQDSLC3ALSSIiDEDDPNILLAIQLSLQESGLA 
LDEETRD FLSNEASLGAIGTS LPSRLDS VPRNTDS PRAALS SS E ) 
LLELGDSLMRLGAENDPFSTDTLSSHPLSEARSDFCPSSSDPDS 
AGQDPNI NDNLLGN I MAWFHDMNPQS IAL I PPATTE I SADSQLP 

CIKDGSEGVKDVELVLPEDSMFEDASVSEGRGTQIEENPLEENI 
PGGGKQHPQAW 


6205 


1


1200 


RAHRGKMALEVGDMEDGQLSDSDSDMTVAPSDRPLQLPKVLGGD 
S AMRAFQNTATACAP VSHYRAVES VDS SEES FSDS DDDS CLWKR \ 
KRQKCFNPPPKPEPFQFGQSSQKPPVAGGKKINNIWGAVLQEQN 
QDAVATELGILGMEGTIDRSRQSETYNYLLAKKLRKESQEHTKD \ 
LDKELD E YMHGG KKMGS KEE ENGQGHLKR KR P VKDRLGNRPEMN 
YKGRYE ITAEDSQEKVADE I S FRLQE PKKDL I ARWRI IGNKKA 
IELLMETAEVEQNGGLFIMNGSRRRTPGGVFLNLLKNTPSISEE 
Q I KD I FYIENQKE YENKKAARKRRTQVLGKKMKQAI KSLNFQED 
DDTSRETFASDTNEALASLDESQEGHAEAKLEAEEAI EVDHSHD 
LDIF 


6206 


10 


1442 


IISERRERSCLHLVCIRCSCDWEMGSVLGLCSMASWIPCLCGS 
APC]XCRCCPSGimSTVTRLIYALFLLVGVCVACVMIjIPGMEEQ 
LNKI PGFCENEKGWPCN I LVGYKAVYRLCFGLAMFYLLLSLLM 
I KVKS S SDPRAAVHNGFWFFKFAAAIAI I IGAFFI PEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLS ATALNYLLS LVAI VLFFVYYTHPAS CS ENKAF I S VNMLLC 
VGASVMSILPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPSLLS I IGYNTTSTVPKEGQSVQWWHAQGI IGLI LFLLCV 
FYSSIRTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYS YS FFHFMLFLASLY IMMTLTNWYRYEPSREM 
KSQWTAVWVKISSS W IGI VLYVWTLVAPLVLTNRDFD 


6207 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAAS PT PI PTVTAPS LGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHI KSCIEA 
HEKDMELSFAVQRSKDMVCGICMEWYEKANPSERRFGILSNCN 
HTYCLKCI RKWRSAKQFESKI I KS CPECRITSNFVI PSEYWVBE 
KEEKQKL I LKYKEAMSNKACRYFDEGRGS CP FGGNCFYKHAYPD 
GRREEPQR QKVGTS S R YRAQRRNHFWELI EERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208 


2924 




1471 


TVMAEAATPGTTATTSGAGAAAATAAAAS PTP I PTVTAP S LGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SVVCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SS SLS S IVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAI EFVP 
GQ P YCiGRTAP S c5rEAPLQGSVT 

VGECRYGENCVYLHGDS CDMCGLQVLHPMDAAQRSQHIKS CIEA 
HE KDMELS FAVQRS KDMVCGI CME WYEKANPS ERRFG I LSNCN 
HT YCLKC I RKWRSAKQFESKI I KS CPECRITSNFVI PSEYWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ERLCFPCMQS KI YSYMS PNKCSGMRFPLQEENSVTHHEVKCQGK 
PLAG I YRKREE KRNAGNAVRS AMKSEEQKI KDARKGPLVPFPNQ 
KSEAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
KIDL IDGKGRG VIATKQFSRGDFWEYHGDLI E ITDAKKREALY 
AQDP STGCYM YYFQYLS KTYCVDATRETNRLGRLINHS KCGNCQ 
TKLHD I DGVPHL I L IASRD I AAGE ELLYDYGD RS KAS I EAHPWL 
KH 


6210 


3761 


387 


I FGMS KLRMVLLEDSGS ADFRRHFVNLS PFT I T WLLLS AC FVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 

S VI CNQLGCP TAI KAPGWANS SAGSGRI WMDHVS CRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVE I QRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCSRYTE I RLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQQLKCGVaLSTPGGAR FGKGNGQI WRKMFHCTGTEQ 
HMGDCPVTALGAS LCPSEQVAS VI CSGNQSQTLSSCNS S SLG PT 
RPTI PEESAVACI ESGQLRLVNGGGRCAGRVE I YHEGSWGTI CD 
LJ b W Ui»b DAH WCKQ LGCGEAI NATGSAHFGEGTGPIWLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA
VGILGVVLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR
EMNSCLNADDLDLMNSSGGHSEPH 


6211 

3761


387


I FGMS KLRMVLLED SGS ADFRRH F VNLS P FT I T WLLLS ACF VT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
I KFQGRWGTVCDDNFNIDHASVI CRQLECGS AVS FSGS SNFGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTI PEES AVAC I ESGQLRLVNGGGRCAGRVE I YHEGSWGT I CD 
U<3 WU-UbDAH v V L-KyLAjUjciAJ. W A Hj5 AnbXsbCj IGP I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE
ACAGRLEVFYNGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA
VGILGVVLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1


1134 


LKWBLRPGG AVWGTGPGAGTR A PR <3 C rTYWMPn P P Q Q T .P P a T?D "D 

RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS
VIFELDSCNGSGKVGLVYKSGKPALAEDTEIWFLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HELAPSAIRRAARLGLGPARWQSRAAAFYFVRGFRTGWSFVGWV 
VU5TSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYK 
YRTYAVRRIRDAFRENKNVKDPVEIQTLVNKAKRDLGVIRRQVH
IGQLYSTDKLIIENRDMPRT


6215 



2 


1849 


FVAGG PRG S GS AAETMP E I RVT PLGAGQDVGRS C I LVS I AG KNV 
MLDCGpIHMG FNDDRRFPD FS Y I TQNGRLTD FLD CVI I SHFHLDH 
CGALP YFSEMVGYDGP I YMTHPTQAI CP ILLEDYRKIAVDKKGE 
ANFFTSQM I KDCMKKWAVHLHQTVQVDDELE I KAYYAGHVLGA 
AMFQIKVGSESVVYTGDYNMTPDRH TEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVLI PVFALGRAQELC 
ILLETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRKT 
FVQRNMrEF KHI KAFDKAr ADWPGPM W r JYxPGnltHAGQbltQl r 
RKWAGNEKNMVI M PGYCVQGTVGHKIliSGQRKLEMEGRQVLE VK 
MQVE YMSFSAHADAKGIMQLVGQAE PES VLLVHGEAKKME FLKQ 
KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLLKREMAQGL 
LPEAKKPRLIiHGTLIMKDSNFRLVS SEQALKELGLAEHQLRFTC 
RTOLHDTRKEQETALRWSHLKSVLKDHCVQHLPDGSVTVESVL 
LQAAAPS EDPGTKVLLVS WTYQDEELGS FLTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPEPRNSALRQSRSKMAVVGVSSVSRLLGRSRPQLGRPMSS 
GAHGEEGSARMWKTLTFFVALPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHLRIRTKP FPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 

IJiKLFIGGLSFETTDDSLREHFEKWGTIiTDCVVMRDPQTKRSRG 

FGFVTYSCVEBVDAAMCARPHKVDGRVVEPKRAV 

HLTVKKI FVGG I KEDTEE YNLRDYFEKYGKIETIE VMEDRQSGK 

KRGFAFVTFDDHDTVDKIWQKYHTINGHNCEVKKALSKQEMQS 

AGSQRGRGGGSGN^GRGGNFGGGGGN , 

GGSRGS YGGGDGG YNGBGGDGGNYGGG PG YS SRGG YGGGGPG YG 

NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 

NYGPMKGGSFGGRSfeGSPYGGGYGSGGGSGGYGSRRF 


6218


1305


906


SCERRGr IMADI/JjJKJKrijYKiClaPoVJb VVfal/KiAsVFVIKVA 
NDNAPEHALRPGFLSTFALATDQGSKLGLSKNKS 1 1 CYYNTYQV 
VQFNRLP L WS F IAS S S ANTGL I VSLE KELAPLFEELRQWE VS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMAS AGGEDCES PAPEADRPHQR P FL 
I G VSGGTAS GKS TVCEKIMELLGQNEVEQRQRKWI LS QDR F YK 
xrr T7AT7ri vta V7\T v , r , riVKTirriUDrv&T?niv7riT.MWD , rr iTKTT'X7T?n'tf T r'\n?\7D 

V Li 1 Any AAlvAJuiVu v * Xn r Url ir UAr OlMLiJ-ilUclK 1 J_*lviN 1 V iivjxV X v JtL V r* 

T YDF VTHS RLPE TTWYPAD WL FEG I LVF YSQE I RDM FHLRL F 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTTFVKPAFEEFCLP 
TKKYADVI I PRGVDNMVAINLIVQHIQDILNGDICKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EQNISLEMSCTIEKAIiADAKALVERIiRDHDDAAESLIEQTTALN 
KRVEANKQ YQEE I QE LNE VARHRPRSTLVMG I QQENRQ I RELQQ 
ENICE1.RT T.FFHOS ALELIMS KYREOMFRLLMAS KKDDPG I IMK 
LKEQHSKIDMVHRNKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RW I WDLN P VS DGLE LRP KYKGI LHCLTT I WKLDGLRGL YQGVT P 
NI WGAGLS WGLYFVFYNAI KS YKTEGRAERLEATEYLVSAAEAG 
AMTLCITNPLWVTKTRLMLQYDAWNSPHRQYKGMFDTLVKIYK 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKLKYNQH INRLP E 
AQLSTVEY I SVAALSKI FAVAATYP YQWRARLQDQHMFYSGVI 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 


6222 



2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKPILCPRRTTAQLG 
PRRNPAWSLQAGRLFSTQTAEDKEEPLHS I ISSTESVQGSTSKH 
EFQAETRKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLV 
SDGQALPEMEIHLQTNAEKGTITIQDTGIG^4TQEELVSNLGTIA 
RSGS KAFLDALQNQAEASS KI IGQFGVGFYSAFMVADRVEVYSR 
SAAPGS LGYQWLS DGSGVFE I AEASG VRTGTKI I IHLKSDCKEF 
SSEARVRDVVTKYSNFVSFPIiYLNGRRMNTLQAIWMMDPKDVRE 
WQHEEFYR YVAQAHDKPR YTLHYKTDAPLWIRS I FYVPDMKPSM 
FDVSRELGSSVALYSRKVIilQTKATDILPKWLRFIRGWDSEDI 
PLNLSRELLQESALIRKLRDVLQQRLIKFFIDQSKKDAEKYAKF 
FEDYGLFMREGIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
EYASRMRAGTRNIYYLCAPNRHLAEHSPYYEAMKKKDTEVLFCF 
EQFDELTLLHLRE FDKXKL I S VETDIWDHYKEEKFEDRS PAAE 
CLSEKETEELMAWMRNVLGSRVTNVKVTLRLDra 
AARHFLRMQQLAKTQEERAQLLQPTLE I NPRHAL I KKLNQLRAS 
E PGLAQLLVDQI YENAM IAAGLVDDPRAMVGRLNELLVKALERH 


6223 


3 


715 


DAWARTMAGMVDFQDEEQVKS FLENME VECNYHCYHEKD PDGCY 
RLVDYLEGIRKNFDEAAKVLKFNCEENQHSDSCYKLGAYYVTGK \ 
GGLTQDLKAAARC FLMACEKPGKKS I AACHNVGLIAHDGQVNBD 
GQPDLGKARD YYTRACDGGYTS SCFNLSAMFLQGAPGFPKDMDL 
ACKYSMKACDLGH I WACANASRMYKLGIX3VDKVEAKAEVLKNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTISSMAWGPLLLTLLAHCTGSWAQSVLTQPPSVSGT^RIPHEK 


6225 



3259 

938


LLS CHRLAI CKLPFS VESRKTVMGPQGARRQAFLAFGD VTVDFT 
QKEWRLLSPAQRALYREVTLENYSHLVSLGILHSKPBIiIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTE IDKVLKGI ENSRWGAFKCAERGQDFSRKMMVTIH 
KKAHSRQKLFTCRECHQGFRDESALIitjHQNTHTGEKSYVCSyCG 
RGFSLKANLLRHQRTHSGEKPFLCKVCX3RGYTSKSYLTVHERTH 
TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKE03RGYT 
NKSYJV^^IHSGEKPYRC^ 

PFACRQCKQSFSVKGSLLRHQRTHSGEKPFVCKDCERSFSQKST 
LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQITHSEEKPFVC 
KDCGRGFIQKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 
LRAHLGEKRFFCRDCGRGFTLKPNLT IHQRTHSGEKPFMCKQCE 
KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTLLFHQKTH 
SGEKPFICSECGQGFIWKSNLVKHQLAHSGKQPFVCKECGRGFN 
WKGNLLTHQRTHSGEKP FVCNVCGQGFSWKRSLTRHHWRIHS KE 
KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 
YYSiCHLKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVSELLGGSQRLFFLPLWRRLCRCGLGPRVS PMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 
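
The one-letter legend in the table header and the two nucleotide-position columns allow a quick consistency check on any row. The following is a minimal sketch in Python and is not part of the original disclosure. It assumes that both positions are 1-based and inclusive and that three nucleotides encode one residue; the table states neither. The names LEGEND, clean_segment, invalid_symbols and codon_count are hypothetical helpers introduced only for illustration. The example values are those of the row for SEQ ID NO: 6226 above.

```python
# One-letter symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def clean_segment(lines):
    """Join the wrapped lines of one segment and drop all whitespace."""
    return "".join("".join(line.split()) for line in lines)


def invalid_symbols(segment):
    """Return (index, character) pairs for symbols absent from the legend."""
    return [(i, ch) for i, ch in enumerate(segment) if ch not in LEGEND]


def codon_count(start, end):
    """Whole codons in the span between two positions.

    Assumption: positions are 1-based and inclusive. Only the absolute
    span is used, so the order of start and end does not matter.
    """
    return (abs(end - start) + 1) // 3


# Row for SEQ ID NO: 6226 as printed (stray spaces are removed by clean_segment).
row_lines = [
    "TKVSELLGGSQRLFFLPLWRRLCRCGLGPRVS PMAGPRVEVDGS",
    "IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA",
]
segment = clean_segment(row_lines)

print("symbols in segment:", len(segment))
print("codons in span 29..266:", codon_count(29, 266))
print("symbols outside the legend:", invalid_symbols(segment))
print("first residue:", LEGEND[segment[0]])
```

Where the beginning position is larger than the end position, codon_count uses only the absolute span; the table does not state what that ordering signifies. Symbols reported by invalid_symbols are candidates for transcription damage and would need checking against the published sequence listing.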


6227 



2581 


890 


MSASSLLEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYTAMSDS YLPS YYS PS IGFSYSLGEAAWSTGGDTAMP YLTS 
YGQLSNGEPHFLPDAMFGQPGALGS TPFLGQHGFNFFPSG I DFS i 
AWGNNSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL j 
NKAPGMNTIDQGMAALKLGSTEVASNVPKVVGSAVGSGS ITSNI 
VASNSLPPATIAPPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PPPPI KHNMD IGTWDNKGP VAKAPSQALVQNIGQPTQGS PQPVG 
QQANNSP PVAQAS VGQQTQPLPPPP PQPAQLSVQQQAAQ PTRWV 
APRNRGSGPGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRS IN 
NYNPKDFDWNLKHGRVF 1 1 KSYSEDD I HRS I KYN I WCSTEHGNK 
RLDAAYRSMNGKG PVYLLFS VNGSGHFCGVAEMKSAVD YNTCAG 
WSQDKWKGRFDVRWIFVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IAS YKHTTSIFDDFSHYEKRQ 


6228 


47 


1978 


GRRCRRRGAVMEIiAQEARBLGCWAVEEMGVPVAARAPESTLRRIi 
CLGQGADI WAYI LQHVHSQRTVKKIRGNLLWYGHQDS PQVRRKL 
ELEAAVTRLRAE IQEIiDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLQDMERKAKV 
DVTFGSLTSAALGLEPWLRDVRTACTLRAQFLQNLLLPQAKRG 
SLPTPHDDHFGTSYQQWLSSVETLLTNHPPGHVLAALEHLAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTLLKERQVLTQRLQGLVEEVERRVLGSSERQVL 
ILGLRRCCLWTELKAIiHDQSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQLVEETQEQVRLLIKGNSASKTRLCRSPGEVLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPSIHQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQALKRLEKLIiKQALERI PELQGIVGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALOKLCS 


6229 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAVDDLQFEEFG 
NAATSLTANPDATTVWIEDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKS S PFWTFE YYQTFFDVDTYQVFDR I KGSL 
LP I PGKNFVRLYIRSNPDLYGPFW I CATLVFAIAISGNLSNFL I 
HLGEKTYHYVPEFRKVS IAATI I YAYAWLVPLALWGFLMWRNS K 
VMNI VSYS FIjEI VCVYGYSLFI YI PTAILWI I PHKAVRWILVMI 
ALG I SGS LLAMTFW P AVREENRRVALAT I VT I VXiLHMLLS VG CL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 


6230


1723 


600 


S KMSGRSG KKKMS KLS RS ARAG V I F P VGRLMR YLKKGT FKYR I S 
VGAPVYMAAV I E YLAAE I LELAGNAARDNKKAR I APRH I LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAEIDLKEDIGKALEKAGGKEFLETVKELRKSQGPIjEVAEAAV 
SQSSGLAAKFVIHCHiPQWGSDKCEEQLEETIKlTCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSESIGIYVQEMAKLDAK 


6231 


149 


870 


LI FS S S TMD RS LRNVL WS FGFLLL FTAYGGLQ S LQS S L YS E EG 
LGVTALSTLYGGMLLSSMFLPPLL IERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTSILLGLGAAPLWSAQCTYLTITGNTHAEK 
AGKRGKDMVNQYFGIFFLIFQSSGVWGNLISSLVFGQTPSQETL 
PEEQLTSCXxASDCLMATTTTNSTQRPSQQLVYTLLGIYTGSGVL 
AVLMIAAFLQP IRDVQRESE 


6232 



3679 


1476 


FVAGTTMAG FWVGTAPLVAAGRRGRW P PQQLMLS AALRTLKHVL 
YYSRQCLMVSRNLGSVGYDPNEKTFDKILVANRGEIACRVIRTC 
KTOGIKTVAIHSDVDASSVHVKMADEAVCVGPAPTSKSYIjNMDA 
IMEAI KKTRAQAVHPGYGFLSENKEFARCLAAEDWFIGPDTHA 
I QAMGDKI ES KLLAKKAEVNT I PG FDGWKDAEEAVR IARE I G Y 
PVM I KASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKFIDNPRHIEIQVLGDKHGNALWLNERECSIQRRNQKVVEE 
APS I FLDAETRRAMGEQAVALARAVKYSSAGTVEFLVDS KKNFY 
FLEMNTRLQVEHPVTEC I TGLDLVQEM I RVAKGYPLRHKQAD IR 
INGWAVECRVYAEDPYKSFGLPSIGRLSQYQEPLHLPGVRVDSG 
IQPGSDISIYYDPMISKLITYGSDRTEALKRMADALDNYVIRGV 
THN IALLRE VI INSRFVKGDI STKFLSDVYPDGFKGHMLTKSEK 
NQLLAIASSLFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH 
DKVHTWASl^GSVFSVEVDGSKIiNVTSTWNLASPLLSVSVDGT 
QRTVQCLSREAGGNMSIQFLGTVYKVNILTRLAAELNKFMLEKV 
TEDTS S VLRS PMPGWVAVS VKPGDAVAEGQEICVI EAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233 


1 


2654 


H S TRENLNAGNFNF P S EGHLVRS TG P GG S FAKHMVAQCVS P KG P 
IiACSRTYFFGATHVPYLGGDS KIiPKKTEQ IRLLSQI YAAVI EAV 
LAGIACYAKTSSLTKAKEVAEQTLGSGLDSFELI PFKAALRSKM 
TFHIKAVNNQGR IVPLDSEDSLS FVKTACMAVYD I PDLLGGNGC 
LGS WFS E S FLTS Q I LVKE KDGT VTTETS S WLTAAVPRFCS WL 
VEDNEVKLSEKTHQAVRGDE S FLGTYLTGGEGAYLYS SNLQS WP 
EEGNVHFFSSGLLFSHCRHGSIIISKDHMNSISFYDGDSTSTVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSBVF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFIiQHFAISSISQEPVMRTHLPVLLQQ 
AEIOTTHRXESDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 
MVYRGIMDSSECFHAAHFQRYLSSALEA0ONRSAR0SAYIRKKT 
RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPLLVQLQSL 
I RAANPAAAF1LAENG IVTRNED I ELI LS ENSFSS PEMLRSRYL 
MYPGW YEGKLNAGS VY PLMVQ I CVWFGRPLEKTRFVAKCKAI QS 
SIKPSPFSGNIYHILGKVKFSDSERTMEVCYOTLANSLSIMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQE IRS IHVKRHLEPLPAGYFYNGTQFV 
NFFGDKTDFHPLMDQFMNDYVEEANREIEKYNQELEQQEYHDLF 
ELKP 


6234 


1731 


404 



PR VREDMDHKS PGNKG S LVYAG I KS I VKS SLGMVE S SRHNWS GIi 
DKOS D I ONLNEERI IiALOL CG W I KKGTDVDVGPFTjNSL VOEG EW 
ERAAAVALFNLDIRRAIQILNEGASSEKGDlj^LNVVAMALSGYT 
DE KNSLWREMCS TLRLQLNNP YLCVMFAFLTS ETGS YDGVLYEN 
KVAVRDRVAFACKFLS DTQLNR Y I EKLTNEMKEAGNLEG I t»LTG 
IiTKDGVDLME S YVDRTGDVQTAS YCP4LQGS PLDVLKDERVQ YW I 
BNYRNLLDAWRFWHKRAEFT)IHRSKIJ)PSSKPLAOVFVSCNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
'CRHGGHAGHMLSWFRDHAECPVSACTCKCWQLDTTGNLVPAETV 
QP 


6235 


1


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPEliAAGGIMATRNPPPQ' ~ 
D YE SDDDS YE VLDLTE Y ARRHQWWNRVFGHS SGPMVE KYS VATQ 
- 1 VMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQIASHSGYVQI 
DWKRVEKDVNKAKRQIKKRANKAAPEINNLIEEATEFIKQNIVI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKBLSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARrHAE 
NAIRQKNQAVNFIiRMSARVDAVAARVQTAVTMGKVTKSMAGVVK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNI4ELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEG IAAGGVMDVNTALQEVIiKTALIHDGLARG IREAAKA 
LDKRQAHLCTLASNCDEPMYVKLVEALCAEHQINLIKVDDNKKIj 
GEWVGLCKI DREGKPRKWGCS CVWKDYGKESQAKDVI EE YFK 
CKK 


6238 


2 


4666 


EEVPTQESVKWEINVIIKNPEIVFVADMTKNDAPALVITTQCEI 
CYKGNLENSTMTAAIKDLOVRACPFLPVKRKGKITTVLQPCDLF 
YQTTQKGTDPQV IDMS VKSLTLKVS PVI INTMITI TSALYTTKE 
TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMIKMNIDSIFIVLEAGIGHRTVPMLLAKSRFSGEGKWWSSL 
INLHCQLELEVH Y YNEMFGVWE PLL E PLE I DQTED FR P WNLG I K 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
LVMLNNLVKAFTEAATGS S ADFVKDLAPFMI LNSLGLTI S VS PS 
DSFSVLNIPMAKSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKI PLTKVGRRLYTVRHRESGVERS I VCQI 
DTVEGS KKVT I RS P VQ I RNHFS VPLS VYEGDTLLGTAS PENEFN 
IPLGSYRSFIFLKPEDENYQMCEGIDFBEIIKNDGALLKKKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYKIAYYIEGIENSVPTLSEGHSAQICTAQl^KARLHLKL 
LD YLNHD WKS E YH I KPNQQD I S FVS FTCVTEMEKTDLD I AVHMT 
YNTGQTWAFHS PYWMVNKTGRMLQYKADGIHRKHP PNYKKPVL 
FS FQ PNHFFNNNKVQLMVTDS ELSNQFS I DTVGSHGAVKCKGLK 
MD YQ VGVT I DLS S FN I TR I VTFTPF YM I KNKS KYH I S VAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYFNKQENC 
I LLRLDNELGGI I AE VNLAEHSTVT TFLD YHDGAAT FLL INHT K 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMP IDLGE KTI YLVS FFEGLQRI ILFTEDPRVFK 
VTYESEKAELAEQE IAVALQDVGI SLVNNYTKQEVAYIG ITSSD 
WWETKPKKKARWKPMSVKHTEKLEREFKEYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKI LQPHVIALRRNYLPALKVEYNTSAHQS 
SFRIQIYRIQIQNQIHGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
VS I VMRSAGHSQISR IKYFKVLIQEMDLRIiDijGFI XALTDLWIE 
AE VTENTE VEL FHKD I EAFKEE YKTASLVDQS QVS LYE YFHIS P 
I KLHLSVS LS SGREEAKDS KQNGGLI PVHSLNLLLKS IGATLTD 
VQDWFKLAFFEIJ^QFHTTSDLQSEVIRHYSKQAIKQMYVLIL 
GLDVLGNP FGLIREFSEGVEAFFYEP YQGAI QGPEEFVEGMALG 
LKALVGGAVGG LAGAAS K I TGAMAKGVAAMTMDED YQQKRREAM 
NKQPAGFREG ITRGGKGLVSGFVSGI TGI VTKPI KGAQKGGAAG 
FFKGVGKGLVGAVARPTGGI IDMASSTFQGI KRATETSEVESLR 
PPRFFNEDGVIRPYRLRDGTGNQMLQKIQFYREWIMTHSSSSDD 
DDDDDDDDE SDLNH 


6239 


2108 



634 


KPGMAGKGS SGRRPLLLGLLVAVATVHLVI CP YTKVEBSFNLQA 
THDLLYHWQDLEQYDHLEFPGWPRTFLGPVVIAVFSSPAVYVL 
SLLEMSKFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM 
FCWVTAMQ FHLM F YCTRTLPNVLALP VVLLALAAWLRHE WARFI " 
WLSAFAI I VFRVELCLFLGLLLLLALGNRKVSVVRAXRHAVPAG 
IECIjGLTVAVBSYFWRQOTWPEGK^ 

^FYSALPRGI^CSLLFIPLGLVDRRTHAPTVLALGFMALYSLL
PHKELRFIIYAFPMLNITAARGCSYLLNNYKKSWLYKAGSLLVI
GHLWNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMLAYTHILMEAA

ERLPRPS 


6240 


2202 



1176 


HERGDSLKEPTS IAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGFELGQLQSIRSEGTTSTSYKSIANQ 
TRNGS LS YDS LLTPSDS PDFESVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
AS PG PGQPPLSS PTRGGVKKVSGVGGTTYE IS V 








RN AFF KKRT ,S LOREKI IARVS I DNRTRALVQAliRRTTDPKLC I T 
RVEELTFHLLEFPEGKGVAVKERI I PYLLRLRQIKDETLQAAVR 
EILALIGYVDPVKGRGIRILSIDGGGTRGWAIiOTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVIVGTVKMSWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQAIRAS S AAPGYFAEYALGNDLHQDGGLLIiNNPSAIiAMHECKC 
LWPDVPLECIVSLGTGRYESDVRNTVTYTSLKTKLSNVINSATD 
T EE VH I ML DGLL P PDTYFRFNP VMCENI PLDESRNEKLDQLQLE 
GLKYIERNEQKMKKVAKIIiSQEKTTLQKINDWIKLKTDMYEGLP 
FFSKL 


6242 



198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
SSEDIDQMFSTLLGEMDLLTQSLGVDTLPPPDPNPPRAEFNYSV 
GFKDLNESIiNALEDQDLDAIjMADLVADISEAEQRTIQAQKESLQ 

NQHHSASLQASIFSGAASLGYGTNVAATGISQYEDDLPPPPADP
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV
VKVHMNDNSTKSLMVDERQLARDVLDNLFEKTHCDCNVDWCLYE
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF
KNPQNFYLDNRGKKESKETNEKMNAKNKESIILEVRLILQSGRKB
KDVCSIFKSFASENNGKI


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RAS S RRLACGPQTRAGAETRSTAMIRANS AARDTRRATCRSA 
AGT PS PTTMTCLTDVPTG CAAVE PTARLPAAAWA3 TI TTG C CPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTGPPAH 
SPTPGS IDPS PELSWGSAGVTQES PLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPIiK 
GPPILAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKliQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDliNDNICKRYIKMITNIVILSLI 
ICISLAFWIISMTASTYYGNLRPISPWRWLFSWVPVLIVSNGL 
KKKSLDHSGALGGLWGFILTIANFSFFTSLLMFFLSSSKLTKW 
KGE VKKRLDS E YKEGGQRNWVQVFCNGAVPTE3UALLYM I ENGPG 
E I PVDFS KQYSASWMCLSLLAALACSAGDTWASE VGPVLS KSS P 
RLITTWEKVPVGTNGGVTWGLVSS LLGGTFVGIAYFLTQLI FV 
NDLDISAPQWPI IAFGGLAGLLGSIVDSYLGATMQYTGLDESTG 
MWNS PTNKARH IAGKP I LDNNAVNL FS S VL IALLL PTAAWG FW 
PRG 


6246 


1177 


359 



SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSIiCWSSCGQHPV 
QATHRGAVSNSiiMLCILKLASQMPLENTWQQWFMLLSNliALS 
HDCKGVIQKSNFLQNFLSL^PKGGNKMLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKPKILANEKVITVLAACLESENQNAQRIGAAALWALIYNYQ 
KAKTALKS PS VXRRVDEAYSIiAKKTFPNSEANPLNAY YLKCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRVWG P WTE PS AGS LRPMAR KQNRNS KELGLVPLTDDTSHAG P 
PGPGRALLECDHLRSGVPGGRRRKDWS CSLLVASIiAGAFGS S FL 
YGYNLSWNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
S I FAIGGLVGTIj I VKM IG KVLGRKHTLLANNG FAI SAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSE I SPKEIRGSLG 
QVTAI FI CIGVFTGQLLGLPELLGKESTWP YLFGVI WPAWQI/ 
LSLPFLPDSPRYLLLEKHNEARAVKAFQTFLGKAHVSQEVEEVL 
AESRVQRS IRLVS VLELLRAPYVRWQVVTVIVTMACYQLCGLNA 
I WFYTNS I FGKAG I P PAKI P YVTLSTGG I ETLAAVFSGLVI EHL 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
IASFCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLSNFAVG 
LLFPFIQKS LDT YCFL VFATI CI TGAI YL YFVLPETKNRTYAEI 
SQAFS KRWKAYP PE EKI DSAVTDGKINGRP 


6248 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRI PKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP
SQPVVELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VI GTPHAKSFVQRFREAES FTQLS EE IQMAWWCRS KKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK 
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV 


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQPVVELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV 


6250 


232 


1306 


LAALHIMALPFRKDLEK^KDLDEDELLGNLSETELKQLETVLDD 
LDPENALLPAGFRQKNQTSKSTTGPFDREHLLSYLEKEALEHKD 
REDYVPYTGEKKGKIFIPKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAILGMHCnjITNTKFCNIMGSSNGVDQEHFSNVVKG 
E KI LPVFDE P PNPTNVEE SLKRTKENDAHLVEVNLNNI KN I P I P 
TLKDFAKALETNTHVKCFSLAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNF ITGVGILALIDALRDNETLAELKI DNQRQQLGTAVE 
LBMAKMLEENTNILKFG YQFTQQGPRTRAANAI TKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 



TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWS KVRHI KGPKDVERSRI FS 
KLCLN I RLAVKEGGFNP EHNSNLAN I LE VCRS KHMP KS T I ETAL 
KMEKS KDTYLL^EGRGPGGS SLL I EALSNSSHKCQADIRHI LNK 
NGGVMAVGARHS FDKKGVI WEVEDREKKAVNLERALEMAIEAG 
AEDVKETEDEEERNVFKFICDASSLHQVRKKLDSLGLCSVSCAL 
EFIPNSKVQLAEPDLEQAAHLIQALSNHEDVIHVYDNIB 


6252 



27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE 
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL 
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 


6254 


155 


1139


HAIiGRRGGSQELSAAACGCFALRLRAPGSGRPALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMIjLKE YRI CMPLTVDE YKIGQLYMI SKH 
SHEQSDRGEGVEWQNEPFEDPHHGNGQFTEKRVYLNSKLPSWA 
RAWPKI F YVTEKAWNYYPYT I TE YTCS FLPKFS IH IETKYEDN 
KGSNDT I FDNEAKDVEREVCF I D I ACDE I PERY YKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
HKVVRDILLIGHRQAFAWVDEWYDMTMDDVREYEKNMHEQTNI K 
VCNQHSSPVDDIESHAQTST 


6255 


1 


1444 



PTRPQQELLVSLATVI FVASQKALS VESKAVI KQQLESVSNGWT 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAAUUljl(jJUU£ciNiSSAfjSUlA£SLKFY 

LSFQCEFVKLRIDLLQAFSQLICTCNSLKTSPPPAIATTIAMTL 
GNDLQRCGRISNQMKQSMEEFRSLASRYGDLYQASFDADSATLR 
NVEIiQQQSCLLISHAIEALIIJDPESASFQEYGSTGTAHADSEYE 
RRMMS VYNHVLEEVES LNGKYTPVS YMHTACLCNAI IALLKVPL 
SFQRYFFQKLQSTS IKLALS PSPRNPAEPIAVQNNQQLALKVEG 
WQHGSKPGLFRKIQSVCLNVSSTLQSKSGQDYKIPIDNMTNEM 
EQRVEPHNDYFSTQFLLNFAILGTHN I TVESSVKDANG I VWKTG 
PRTTI FVKSLEDPYSQQ IRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 

1542


CRGAGAEPAANPRSPRSLVPSLES TS TSVPPAPGTMATDS WAIiA 
VDEQEAAAESLSNLHLKEEKIKPDTNGAVVKTNANAEKTDEEEK 
EDRAAQS LLNKLIf^^VDNTC^VEVLQRDPNS PLYSVKS FEEL 
KIiKPQIjLiUG VYAMGFNRP S KI QENAL PLMLAE P PQNLIAQSQSG 
TGKTAAFVLAMLSQVE PANKYPQCLCLS PTYELALQTGKVIEQM . 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 

onl S VWJVTAyj\V VcUfvi V J.iNJjxVtvct£t£t 1 LiVX X ISSJ I I VJjL.ooK 

DEKFQALCNLYGAITIAQAM I FCHTRKTASWLAAELS KEGHQVA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTNVCARG I DVEQVSW 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDS KHSM 
NILNRIOEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE
NNIVKEELALLDGSNVVFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG
KA 


6258 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE
NNIVKEELALLDGSNVVFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG
KA 


6259 


2 



1540 


ILEKGFPSQOTPERKWKVDDVTjESSQENEDDHFWELLFHNNKTV 
S VENGDRGSKTFNLGTDPVSLRNYP YKI CDS CEMNLKNTSGL 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPSFGQSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTFI ESLKLNI SQRPHLEMEP YGCS I CGKS FCMNLRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIHQGAYTRKIIiREYKVSD 
KTWEKSALLKHQ I VHMGGKS YDYNENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 



WO 01/53312    PCT/US00/34263



KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP 
YE CNACGKS F CHRS ALTVHQRTHTGE KP F I CNE CGKS FCVKSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRSVLTKHQR IHTRVKALSTS 


6260 


2081 


1436 


GTGPE IHACAHAS ARAPGSRAMALREIiKVCLLGDTGVGKS S I W 
RFVEDSFDPNINPT I GAS FMTKTVQYQNELHKFL I WDTAGQERF 
RAliAPMYYRGSAAAI IVYDITKEETFSTLKNWVKELRQHGPPNI 
WAIAGNKCDLIDVRE VMERDAKDYADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
S PACE PCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG 
MAEVSIDQSKLPGVKEVCRDFAVLEDHTLAHSLQEQEIEHHLAS 
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA 
QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
E I ARKLQEEELLATQVDMRAAQVAQDEE IARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKP KTAKAANS KSKESDE PHHS KNERPA 
RPPPPIMTDGEDADYTHFTNQQSSTRHFSKSESSHKGFHYKH 


6262 


2 


1759 


PE CHSQGLCS VHRPGKVPQARMSGL VLGQRDEPAGHRLSQEE I L 
GSTRLVSQGLEALRSEHQAVLQSLSQTIECLQQGGHEEGLVHEK 
ARQLRRSMENIELGLSEAQVMIALASHLSTVESEKQKLRAQVRR 

LCQENQWLRDELAGTQQRLQRSEQAVAQLEEEKKHLEFIiGQLRQ 
YDEDGHTSBEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGGYEIPARLRTLHKLVIQYAAQGRYEVAVPLCKQALEDLER 
TSGRGHPDVATMLNILALVYRDQNKYKEAAHLLNDALS IRESTL 
GPDHPAVAATLNNLAVLYGKRGKYKEAEPLCQRALEIREKVLGT 
NHPDVAKQLNNLALLCQNQGKYEAVERYYQRALAIYEGQLGPDN 
PNVARTKNNLAS CYLKQGKYAEAETL YKE I LTRAHVQE FGS VDD 
DHKP I WMHAEEREEMS KSRHHEGGTPYAEYGGWYKACKVSS PTV 
NTTLRNLGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGpSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


REIiDSLADLPERIKPPYANGEiSTSHLRSSSVEDVKLIISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKYYSADRNLIKNT 
APVNTVMDSPVHLEPSSQVGVTQNKSWEMPVDRLETLSTRDFIC
PNSNIPDQESSLQSFCNSENKVLKENADFLSLRQTELPGNSCAQ
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ
KNAVQIISSALDTDNESTKDTENTFVLGDVQKTDAFVPVYSDST
IQEASPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAFSKLTYK
SSSGHEVENSTraTQVISHEKSNKLESLVTiTHLSRCDSDIiCEMN 
AGMPKGNLNEQDPKHCPESEKCLLSIEDEESQQSILSSLENHSQ
QSTQPEMHKYGQLVKVELEENAEDDKTENQIPQRMTRNKANTMA 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVSPSLLQAKEKTQQSLAAIVDSLKLDEIQPYSSER 
ANPYFEYLHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLPFSACTVLLD7\EVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SEQ
ID
NO:


Predicted
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL
DKKRDDFCKQNQEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143


1960


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPVVWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL
DKKRDDFCKQNQEASSDRCSALLQVTFSPLEEEVKAGIYSKPGG
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 

276


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE

GFTVNEINKKSIHISCPKENASSKFLAPYTTFSRIHTKSITCLD 

ISSRGGLGVSSSTDGTMKIWQASNGELRRVLEGHVFDVNCCRFF 

PSGLVVLSGGMDAQLKIWSAEDMCVVTFKGHKGGILDTAIVDR 

GRNWSASRDGTARLWDCGRSACLGVLADCGSSINGVAVGAADN

SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF

LFIGSDAFNCCTFLSGFLI^GTQDGNIYQLDVRSPRAPVQVIH~ 

RSGAPVLSLLSVRDGFIASQGDGSCFIVQQDLDYVTELTGADCD 

PWKVATOEKQICTCCRDGLVimVQLSDL 


6267


3


622


^3MMKKNNSAKRGPQIX3NQQPAPPEKVGWVRKFCGKGIFREIWK 
NRYWLKGDQLYISEKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RSKKNHSKFTLAHSKQPGNTAPNLIFLAVSPEEKESWINALNSA 
ITRAKNRILDEIVTVEEDSYLAHPTRDRAKIQHSRRPPTRGHLMA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC


6268 


160 


1368 


HRELCQNLPAGLSSALIDNPLTLLLSIDTYVMLQEPVTFQDVAV
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA
PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKIPPRKRLRKRDSQVKSMKHNSRVKIHQ
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFRNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
GERPFECQECGRTFNDRSAISQHLRTHTGAKPYKCQDCGKAFRQ
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 

U Ct O J 






TQYVKWTNDKSLGGIEGCLSKLKAADPTFVMGHAMATGLVLIGT
GSSVKLDKELDLAVKTMVEISRTQPLTRREQLHVSAVETFANGN
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR
IYPFWTPDIPLSSYVKGIYSFGLMETNFYDQAEKLAKBALSINP
TDAWSVHTVAHIHEMKAEIKDGLEFMQHSETLWKDSDMLACHNY
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDWDSCSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDASESPGENCQHLLARDVGLPLCQALVEAE 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DGNPDRVLELLLPIRYRIVQLGGSNAQRDVFNQLLIHAALNCTS
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


SVTVTLGSEGDGRPPTYHLEEMEQEPQNGEPAEIKIIREAYKKA
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
GPGWESARQMQQKMKETLQNVRTRLEILEKGLATSLQNDLQEVP
KLYPEFPPKDMCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA
PASLSLPSQSCPAEAPPAYTPQAAEGHYTVSYGTDSGEFSSVGE
EFYRNHSQPPPLETLGLDADELILIPNGVQIFFVNPAGEVSAPS
YPGYLRIVRFLDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK
CTAGAYMFPDTMLQAAGCFVGWLSSELPEDDRELFEDLLRQMS
DLRLQANWNRAEEENEFQIPGRTRPSSDQLKEASGTDVKQLDQG
NKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW
SEKVAHNILSGASWVSWGLVKGAEITGKAIQKGASKLRERIQPE
EKPVEVSPAVTKGLYIAKQATGGAAKVSQFLVDGVCTVANCVGK
ELAPHVKKHGSKLVPESLKKDKDGKSPLDGAMVVAASSVQGFST 
TOQGLECAAKCIVNlfV'SAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYNINNIGIKAMVKKTATQTGHTLLEDYQIVDNSQRENQ
EGAAlWNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


GCGVKTAGMVGREKELSIHFVPGSCRLVEEEVNIPNRRVLVTGA
TGLLGRAVHKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHH
IIHDFQPHVIVHCAAERRPDWENQPDAASQLNVDASGNLAKEA
AAVGAFLIYISSDYVFDGTNPPYREEDIPAPLNLYGKTKLDGEK
AVLENNLGAAVLRIPILYGEVEKLEESAVTVMFDKVQFSNKSAN
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPSIKGTFHWSGNEQM
TKYEMACAIADAFNLPSSHLRPITDSPVLGAQRPRNAQLDCSKL
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


1136 


528 


GAVMEDAAAPGRTEGVLERQGAPPAAGQGGALVELTPTPGGLAL
VSPYHTHRAGDPLDLVALAEQVQKADEFIRANATNKLTVIAEQI
QHLQEQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRESGQ
QYFSIISPKEWGTSCPHDFIiC^YKLQHDLSWtPYEDIEKQDAKI ' 
SMMDTLLSQSVALPPCTEPNFQGLTH * . . 


6273


256 


843 



SCPRVSPECRSLGCQVMFSLPLNCSPDHIRRGSGWGRPQDLKIA 
SAAWNSKCHPGAGAAMARQHARTLWYDRPRYVFMEFCVEDSTDV 
HVLIEDHRIVFSCKNADGVELYNEIEFYAKVNSKDSQDKRSSRS 
ITCFVRKWKEKVAWPRLTKEDIKPVWLSVDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT


6274


3D 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCIiAGATtTGDCVGSFYEAHDT 

VDLTSVLRHVQSLEPDPGTPGSERTEALVYTDDTAMARALVQSL
LAKEAFDEVDMAHRFAQEYKKDPDRGYGAGVVTVFKKLLNPKCR
DVFEPARAQFNGKGSYGNGGAMRVAGISLAYSSVQDVQKFARLS
AQLTHASSLGYNGAILQALAVHLIALQGESSSKHFLKQLLGHMED

UtiKjVjMJb V bJJAKTi JUbMnh KP x o bKJjKKI GEIjJLjJ^ASVTREEVVS 

ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI
SLGGDTDTIATMAGAIAGAYYGMDQVPESWQQSCEGYEETDILA
QSLHRVFQKS


6275 


20 


565 


SRRGRARCLARGSRRPVPRPAKTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 

EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE
KCHVM 


6276 


797 


97 


TLLPLPPLPDTEGMILLNTGLEGTVAENPVPIVHTPSGNILTLE
SCLQQIATHPGHWGIHLQIAEPAALRPSLaLLARLSSLGLLHWP
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV
LGSGYREQLLTDMLELCQGLWQPVSFQMQAMLLGHSTAGAIGRL
LASSPRATVTVEHNPAGGDYASVRTALLAARAVDRTRVYYRLPQ
GYHKDLLAHVGRN 






WO 01/53312 



PCT/US00/34263 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 



6277 



4600 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



2744 



MAFRTE^LYYSYFKTIVEAPSFLNGVWMIt4NDKLTEYPLVINT 
LKRFNLYPEVILASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TESCEGLGDPACFYVAVIFILNGLMMALFFIYGTYLSGSRLGGL 
VTVLCFFTOHGECTRVMWTPPIiRESFSYPFLVLQMLLVTHILRA 
TKLYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYWGY
IDICKLRKIIYIHMISLALCFVLMFGNSMLLTSYYASSLVIIWG
ILAMKPHFLKINVSELSLWVXQGCFWLFGTVILKYLTSKIFGIA 
NDAHIGNLLTSKFFSYKDFDTLLYTCAAEFDFMEKETPLRYTKT 
LLLPWLVGFVAIVRKIISDMWGVLAKQQTHVRKHQFDHGELVY
HALQLLAYTALGILIMRLKLFLTPHMCVMASLICSRQLFGWLFC
KVHPGAIVTAILAAMSIQGSANLQTQWNIVGEFSWLPQEELIEW
IKYSTKPDAVFAGAMPTMASVKLSALRPIVNHPHYEDAGLRART
KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 

PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEW 
KE 



6278 



823 



ILFRLVLLSLVYLLNSVATEERKPAEVLIVEGQQYAWGTVLLL

IRIILEYCQGVDNIPSVTTDMLTRLSDLLKYFNSRSCQLVLGAG

ALQWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ

YSMLRHFDHITKDYHDHIAEISAKLVAIMDSLFDKLLSKYEVKA 

PVPSACFRNICKQMTKMHEAIFDLLPEEQTQMLFLRINASYKLH 

LKKQLSHLNVINDGGPQNGLVTADVAFYTGNLQALKGLKDLDLN 

MAEIWEQKR 



6279 



127 



1687 



GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTL 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA 
AESLNSEVVMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR
LSWSGIPKPVRPMTWKLLSGYLPANVDRRPATLQRKQKEYFAFI
EHYYDSRNDEVHQOTYRQIHIDIP^PEALII^PKVTEIFERI . 
LF I WAIIUIPASG WQGINDLVTPFBVVFI CE YIEAEE VDTVDVS 
GVPAEVLCNIEADTYWCMSKLLDGIQDNYTFAQPGIQMKVKMLE 
ELVSRIDEQVHRHLDQHEVRYLQFATOWMimLU^REVPIiRCTIR 
LWDTYQSEPDGFSHFHLYVCAAFLVRWRKEILEEKDFQELLLFL 
QNLPTAHWDDEDISLLLAEAYRLKFAFADAPNHYKK



6280 



857 



2515 



6281 



857 



ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIXLWDIRRFSSR
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM
PESEECASAPAPVPQSSTPFSSPQ 



2515 



ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 
SEQ
ID
NO:


Predicted
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 

GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK
LTKHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM
PESEECASAPAPVPQSSTPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHIBDAAVPGAQEALKRLRGASVIIR 
FVTNTTKESKQDLLERLRKLEFDISEDEIFTSLTAARSLLERKQ 
VRPMLLVDDRALPDFKGIQTSDPNAWMGLAPEHFHYQILNQAF
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT 

GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL 


6283 


140 


1043 



LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPSIPKNALPITKPTSPAPAAQSTNGTHASYGPFYLEYSLL
AEFTLWKQKLPGVYVQPSYRSALMWFGVIFIRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGELDVKRAFAKWR
RNHNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK 
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 



1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRWINLHLBKCNPPLEVKDLFVDIQDGKI 
LMALLEVLSGRNLLHEYKSSSHRIFRLNNIAKALKFLEDSNVKL
VSIDAAEI^GNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LAPGSGGTDSDSSFPPTPTAERSVAISVKDQRKAIKALLAWVQR
KTRKYGVAVQDFAGSWRSGLAFLAVIKAIDPSLVDMKQALENST
RENLEKAFSIAQDALHIPRLLEPEDIMVDTPDEQSIMTYVAQFL
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHALSDS 
STEFMHQIIDCVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 
LPIKKTVHFEADTYKDPFCSKNLSLCFEGSPRVAXESLRQDGHV
LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 
CNGALESTARHDEESHSLSPPGENTVMAD^ 

EGDYFEAIPLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA
EAFENHAEKLGKRSIKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPHEDHQQRETKENDPMDSHQSQESPNLENIANPLEENVTKES

TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIFSYITAVTLHH 
IDPALPYISDTGTVAPEKCLFGAMLNIAAVLCIATIYVRYKQVH 
ALSPEENVIIKLNKAGLVLGILSCLGLSIVANPQKTTLFAAHVS
GAVLTFGMGSLYMFVQTILSYQMQPKIHGKQVFWIRLLLVIWCG
VSALSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMITTAA
EWSMSFSFFGFFLTYIRDFQKISLRVEANLHGLTLYDTAPCPIN
NERTRLLSRDI 


6286 


1619 


27S 


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNIINKLLKDKNEFHKHVEFDFLIKGOFLRM
PLDKHMEMENISSEEWEIEYVEKYTAPQPEQCMFHDDWISSIK
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDSIA
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDHTIRWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTSVKWSPTHEQQLISGSLDNIVKLWDT 
RSCKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TTSHVGA


6287 


278 


1482 


MQFFFNFQIGLRSTSGKEKYSGDAGFLGDALQLFLQCLALDEDF 
IAPAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT
QLLEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA 
I PTV P C P LH v r E PR YRLiM I RRS I QTGTKQ FGMCV SDTQNS FAD Y 

GCmQIRNVHFLPDGRSVVDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV


6288 


1 


743


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKGAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR


6289 


1 


743 



VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGCDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3


1856


TLGRWLLGVYETVAPTUVCLPRPRLRRRRRRRRRRMISRYTRKA 

VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 

DITRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 

QNATEKVQTMFTAIDELLYEQKLSVHTKSLQEECQQWTASFPHL 

RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR

GICKLHFSSSYAHKASSIAKSSSFCSMERDEEDSIIVSEGIIEEY

LAFDHIDIEEGFHGKKSEAATEKQK1jGYPPIAPFYC3^KEDVLAY 

VFDS VWCKWS CMEQLTRSHWEGFASDDESNVAVTRPDSES S CV 

LSELHPLVLPRVPQSKVLYITSNPMSLOQASRHQPNVNDLLVHG 

•MPLQPRNLSLMDKLLDLDDKLUVIRPGSSTILSTRNWPNRAVEFS 

TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILRGA 

RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK 

PHGDSSRAQSAWDEPNYQQPQERLLLPDFFPRPNTTQSFLLDT 

QYRRSCAVEYPHQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG

P 


6291 




c no t 
b u z 


Jj VAKJyiAi>b ASARTPAGKR VI NQEELRRLMKE KQRLS TSRKR I E S 
PFAKYNRLGQLSCALCNTPVKSELLWQTHVLGKQHREKVAELKG
AREASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV

IGEIDEQIECTRRVEKLRNRQDEIKNKLKEILTIKELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6292 


1835 


1142


TCPGAMKMVAPWTRFYSNSCCLCCHVRTGTILLGVWYLIINAW
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI
LICAMATYGAYKQRAAWIIPFFCYQIFDFALNMLVAITVLIYPN
SIQEYIRQLPPNFPYRDDVMSVNPTCLVLIILLFISIILTFKGY
LISCVWNCYRYINGRNSSDVLVYVTSNDTTVLLPPYDDATVNGA
AKEPPPPYVSA 


6293 


2382 


1035 


FWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEWDKSQVSRTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK 
LEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDGLDWFCY
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRWHRL 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6294 


354 



1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCLGTDRGFPELSHHC
KNVIATASDYDMAEITNIRPSFDVSPWAGLIGASVLWCVSVT
VFVWSCCHQQAEKKHKNPPYKFIHMLKGISIYPETLSNKKKIIK
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 
LPIKMDYGEELRSPITSLTPGESKTTSPSSPEEDVMLGSLTFSV 
DYNFPKKALVVTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDIIKRNIQKCISRGELQV
SLSYQPVAQRMTWVLKARHLQKMDIAGLSGNPYVKVNVYYGRK
RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD
RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAKWHSLS
EY 


6295 


2795


617


VSSALLTGATSGSDAAKSEGASASPLSCTNAVAMDRPDEGPPAK
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPQQRPRLQEETEAA
QVLADMRGVGliGPALPPP P PYV1LEEGGIRAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
GALETCSAVGWAPQRLVDPKSKEEAIIIVEDEDEDERESMRSSR 

RRRRRRRRKQRKVKRESRERNAERMESILQALEDIQLDIIEAVNI
KAGKAFLRLKRKFIQMRRPFLERRDLIIQHIPGFWVKAFLNHPR
ISILINRRDEDIFRYLTNLQVQDLRHISMGYKMKLYFQTNPYFT
Jnwivkefqrnrsgrlvshstpirwhrgqee^ . 
FFSWFSNHSLPEADRLAEIIKNDLWVNPLRYYLRERGSRIKRKK
QEMKKRKTRGRCEWIMEDAPDYYAVEDIFSEISDIDETIHDIK
ISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDN
ISTESADDHETTDNNESADDNNSFJPEDNNKNTDDNEENPNNNENTY
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG
SDDDGNEGDNEGSDDDDRDIEYYEKVIEDFDKDOADYEDVIEII
SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS
DLEDVLQVPNGWANPGKRGKTG


6296 


727 


1199 


RHCGCDAQGACDSLPPTGTSSPVTARNAIPEARCCVWLLDGTTV 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
IiGTSLQFPSPFSGTISFGSFSDSiGIFPLGSQCCLGFQQFSISGK 
KWALIHKRVRLSVFGARWGRIYFGK


6297


1 


922 


QRAAAASPSSCGPRGAEYGALMAMEGYWRFLALLGSALLVGFLS
VIFALVWVLHYREGLGWDGSALEEWWHPVLMVTGFVFIQGIAII
VYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISWAVFENHNVN
NIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFL
MPIHVYSGIVIFGTVIATALMGLTEKLIFSLRDPAYSTFPPEGV 
FVNTLGLLILVFGALIFWIVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRNLALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTLQGAGTTTKMAVARLAAVAAWVPCRSWGWAAV
PFGPHRGLSVLLARIPQRAPRWLPACRQKTSLSFLNRPDLPNLA
YKKLKGKSPGIIFIPGYLSYMNGTKALAIEEFCKSLGHACIRFD
YSGVGSSDGNSEESTLGKWRKDVLSIIDDLADGPQILVGSSLGG
WLMLHAAIARPEKWALIGVATAADTLVTKFNQLPVELICKEVEM
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL
LHGMKDDIVPWHTSMQVADRVLSTDVDVILRKHSDHRMREKADI 
QLLVYTIDDLIDKLSTIVN 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6299 


512 


814 


ECDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS
SIDAMDDSAFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA 
AQSPFSIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR
LVIGSLPAHLSPHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR 

ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKELGCLLGSDICLTP
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS
PVLLNCLPGDLRPLDELYAQICLKYKAISEELDHALNDMTSL


6303 


2 


1961 



YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM
KVDLVSFLSSPIMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELDIDENPASDFDDSGSLLGFKYGSGQKYGGIPNFS 
HRQVRYLEKNVKLKSKYLDMRRQIKMKNKHIFFTKESEKPFFKK 
SKILSKVEKFLTWVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLLATV 
PDEQDCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA
VPELAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAGR
VSQSFKCDWVDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA
RNNAEVYGIADKIEFICGDFLLLASFLKADWFLSPPWGGPDYA
TAETFDIRTMMSPDGFEIFRLSKKITNNIVYFLPRNADIDQVAS
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438


HRARVDRSRES PGGDLRHPGRVRRD I TLSGHPRLSTQH WIJjRE 
DEVGDPGTKDLGHPQHGSPIQETQSEWTLVSPLPGSDMAALPA
WRATSGLTLWPHTAEGRDLLGAENRALTGGQQAEDPTLASGAYQ
WPGSVEKLQGSVWCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL
AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL
CGLIKRPGDLPEVLSFHVDRVLGLRRSLPAVARRFHSPLLPYRY
TDGGARPVIWWAPDVQHLSDPDEDQNSLALGWLQYQALLAHSCN
WPGQAPCPGIHHTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD
PCVEERLREKCRNPAELRLVHILVRSSDPSHLVYIDNAGNLQHP
EDKLNFRLLEGIDGFPESAVKVLASGCLQNMLLKSLQMDPVFWE
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


99 


420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREEDQGAAETQVPDLEADLQELSQSKTGDECGDGPD 
VQGKILTKSEQPKMPEGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 

KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSEADKSALMD 

ESEDSGVIPGSHSENALHASEEEEGEGGKAQSSLGYIPLMRWQ

SVRHTTRKSSTTLREGWVVHYSNKDTLRKRHYWRLDCKCITLFQ 

NNTTNRYYKEIPLSEILTVESAQNFSLVPPGTNPHCFEIVTANA 

TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 

APGHAPHRQASLSISVSNSQIQENVDIATVYQIFPDBVLGSGQF I 

GWYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAILQSLR 

HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 

TKFLITQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 

DFGFARIIGEKSFRRSWGTPAYLAPEVLLNQGYNRSLDMWSVG

VIMYVSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 

LINNLLQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 

YITHESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LAERISVL 


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKWRQSKFRHVFG
QPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL
VT ,P T i Q KTOR T TlK A V PTVPftHTCJPVLD TDWPPWMnRU T ZV Qf5 «3 PHP 

TVMVWQIPENGLTSPLTEPVWLEGHTKRVGIIAWHPTARNVLL
SAGCDNWLIWNVGTAEELYRLDSLHPDLIYNVSWNHNGSLFCS
ACKDKSVRIIDPRRGTLVAEREKAHEGARPMRAIFLADGKVFTT
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV
VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQRGMGSMPKR
GLEVSKCEIARFYKLHERKCEPIVMTVPRKSDLFQDDLYPDTAG
PEAALEAEEWVSGRDADPILISLREAYVPSKQRDLKISRRNVLS
DSRPAMAPGSSHLGAPASTTTAADATPSGSLARAGEAGKLEEVM
QELRALRALVKEQGDRICRLEEQLGRMENGDA


6308 


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR
LLSAFLPARFYQALDDRLYCVYQSMVLFFFENYTGVQILLYGDL 
PKNKENIIYIiANHOSTVDWIVADILAIRONALGHVRYVLKEGLK 
WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 
VIFPEGTRYNPEQTKVLSASQAFAAQRGLAVLKHVLTPRIKATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRFPGKS VNS KLS I KKTLPSML I LSGLTAGMLMTDAGRKti 
YVNTWIYGTLLGCLWVTIKA


6309 


220 


563 


LVAEVKEPCSLPMLSVEMENKENGSVGVKNSMENGRPPDPADWA
VMD WNY FRTVfi FF EOAS AFOEOE I DGKS T J JjMTRNn VLTGLOL 

KLGPALKIYEYHVKPLQTKHLKNNSS 


6310 


36 



979 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSASE
AQRGSLAELNVAAAGLWADCDQPLYDCPMCGLICTNYHILQEHV
DLHLEENSFQQGMDRVQCSGDLQLAHQLQQEEDRKRRSEESRQE
IEEFQKLQRQYGLDNSGGYKQQQLRNMEIEVNRGRMPPSEFHRR

KADMMESLAIjGFDDGTCTKTSGT TEAIiHRYYONAATDVRRVWTi^ <5 
xvnL/i'u ujw xmjjyjt? xsxaxcvx xvx ouxi, i_»n i inn x x yiMruii x/ v xuv v nuoo 

VVDHraSSLGDKGWGCX5YRNFQMl^SSLLQNDAYNDCLKGMLIP 
CIPKIQSMI EDAWKEGFD PQGAS QL 1 1 RLQGTKAW I GACEVY I L 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRLAAAARTGHGVGRRARLACLGEPRVKAAVMLTL
ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL
PCTCKVHFPDPNKLHCFQLTVTPDEGYYQGGKFQFETEVPDAYN
MVPPKVKCLTKIWHPNITETGEICLSLLREHSIDGTGWAPTRTL
KDVWGI^SLFTDlxLNFDDPl^IEAAEHHLRDKEDFRNKVDDYI 
KRYAR 



6312 


213 


1400 


GDELVKRFAGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCISIPPPLT
RQISVKALGIGKNWCEKAATSVDAFRMVTASRYYPQLMSLVGN
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
IRGIRHVTSDDFCFFQMLMGGGVCSTVTLNFNMPGAFVHEVMW 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL
YMQSVVDAIKRSSRSGEWEAVEVLTEEPDTNQNLCEALQRNNL 


6313 


2 



2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE
SEQ
ID
NO:


Predicted
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide 

(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVTjE 


6314 


2 



2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 


6315 


1 


1015


LGLAVNVVTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA
IDGKQARRTNSCSPLGELFDHGCDSLSTVFMAVGASIAARLGTY
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVI
VFVLSAFGGATMWDYTIPILEIKLKILPVLGFLGGVIFSCSNYF

HVTLHGGVGKNGSTIAGTSVLSPGLHIGLIIILAIMIYKKSATD

VFFKWPCT ,ytt .MT?rcr^paT^<;nTrrAA7ZiWMTK' QTPT.VT.rin'mrTrT^'D 
v c Drair v.u x -Li-trir vj\_ v cj\js. v oyrsxi v v /ull'll jnojZiJj x JjvjL/ 1 vf JjVjIt 

GLLFLDQYFNNFIDEYVVLWMAMVISSFDMVIYFSALCLQISRH 

LHLNIFKTACHQAPEQVQVLSSKSHQNNMD


6316 


1503


792 

9 


VSAGAGTGIMGGTTSTRRVTFEADENENITVVKGIRLSENVIDR 

MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 

EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
AROLE E KDRVT. KTCOD AFYKF OT •APT .PPP <3 <? F. PVR V'TTPOVnTf A 3\ 

EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQDARYGQKDSSDQN
FDYMFKLLIIGNSSVGKTSFLFRYADDSFTSAFVSTVGIDFKVK
TVFKNEKRIKLQIWDTAGQERYRTITTAYYRGAMGFILMYDITN
EESFNAVQDWSTQIKTYSWDNAQVILVGNKCDMEDERVISTERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRLKETPPPPQPNCAP


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRREAPPL
LGPLLSPFPLPAGSWHRQMLRSSLRFPITNSAGAPCKAAGRMNI
LAPVRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM
ERFEVLGIPFSLQLWDTAGQERFKCIASTYYRGAQAIIIVFNLN
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM
EKDALQVAQEMKAEYWAVSSLTGENVREFFFRVAALTFEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRI^QNTLLLGKKVVLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEPLTLEQEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTLGE I EVM I AEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC


6320 


90 


1111 


RPRTGRE KVAMAAVDS FYliYRE IARSCNCYMEALALVGAWYTA 
RKSITVI CDFYSLIRLHFIPRLGSRADLIXQYGRWAVVSGATDG 
IGKAYAEELASRGLNI I L I SRNE E KLQWAKD I ADT YKVETD 1 1 
VADFSSGRE I YLP I REALKDKDVGI LVNNVGVFYPYPQYFTQLS 
EDKLWDI INVNIAAASLMVHWLPGMVERKKGAIVTISSGSCCK 
PTPQLAAFSASKAYLDHFSRALQYE YASKGI FVQSLI PFYVATS 
MTAPSNFLHRCSWLVPSPKVYAHHAVSTIiGISKRTTGYWSHSIQ 
FLFAQYMPEWLWVWGANI LNRSLRKEALSCTA 


6321 


1418 


341 



HRKAALGALMAGRLLGKALAAVSLSLALASVTIRSSRCRGIQAF 
RNSFSSSWFHIjNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 

vpnekvgwlvewqdykpveytavsvlagprwadpqisesnfspk 
fnekdghverksknglyeiengrprnpagrtglvgrgllgrwgp 
nhaadpiitrwkrdssgnkimhpvsgkhilqfvaikrkdcgewa 
ipggmvdpgekisatlkrefgeealnslqktsaekreieeklhk 
lfs qdhlvi ykgyvddprntdnawmeteavnyhdetge imdnlm 
leagddagkvkwvdindklklyashsqfiklvaekrdahwseds 

EADCHAL 


6322 


2047 


1083 


NQEILKNVESSRTVQPHFLEFIiLSLGWSVDVGRHPGWTGHVSTS 
WS INCCDDGEGSQQEE VI SSEDIGAS I FNGQKKVLYYADALTEI 
AFWPSPVESIiTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNS SQRLS PSSRMRKLPQGRP VPPLGPETRVS WWVE 
RYDD IENFPLSELMTE I STGVETTANSSTSLRSTTLEKE VPVI F 
IHPLNTGLFR I KIQGATGKFNMVI PLVDGMI VSRRALGFLVRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 



656 


PASTTDGAQEARVPLDGAFWI PRP PAGSPKGCFACVSKPP ALQA 
PAAPAPEPSASPPMAPTLFPMESKSSKTDSVRAAGAPPACKHLA 
EiOCTMTNPTTVIEVYPDTTEVTOYYLWSIFN^ . 
* LAYSLKVRDKIOjI^I^^ 

MALS VIATHRGLRSSAS ILVAEPHDWNrERPQVTFRERCPAL 


6324 


1 


2061


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 

RPG PGAGAPAGRPEGGG P WARTEGS S LHSEPERAGIjGPAPGTES 

PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 

SELGTTCLWTETGTDGLWTDPHRSDLQFQPEBASPWTQPGVHGP 

WTELETHGSQTQPERVKSWADNLWTHQNSSSLQTHPEGACPSKE . 

PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 

QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 

EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 

RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 

KTVLKYS PF WS FRKHYPWVQLSGHAGNFQAGEDGR I LKRFCQC 

EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFNQMEDLLADFEGP 

SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 

EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCNTNF 

KKTQALEQVTKYLEDFVDGDHVII^KYVACLEELREALEISPri? 

KTHEWGSSLLFVHDHTGLAKVWMIDFGKTVALPDHQTLSHRLP 

WAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


155 


944 



SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLS EKDRMELLE I AKTNAAKALGTTNIDL.PAS LRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


GEPS PATQQKPSATGAGVLHQHFSSGHIYVLMGLLPPPWTISFT 
VQTTLQPPGGLPAAPVSGRMAFE P VGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRP KS GERLHVTVAPCWEFVLPSVS LTA 
QAWGGVGQEASSGVP 


6327 


1 



1337 


SLARLAPAGGSWMPTQQPAAPSTRAPKPSRSLSGSLCALFSDA
DSGSGMKAELPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 
GABPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSIiFSHLPQYS 
RQNSLTQFMSIPSSVIHPAMVRLGLQYSQGLVRGSNARCIALLR 
ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPLSASMHN 
AIKFLNKEITSVGSSKREEEAKSELRAAIDRYVQEKIVLAAQAI 
SRFAYQK I SNGDVI LVYGCS SLVSRILQEAWTEGRRFRVWVDS 
RP WLEGRHTLRSLVHAGVPAS YLL I PAAS YVLPE VSTEEKDS KV 
GGEKV 


6328 


1030 


276 


HASAE VTTAAARGLGAMEEEMHTDAKI RAENGTGS S PRGPQCSL 
RHFACEQNIiLSRPDGSASFLQGDTSVLAGVYGPAEVKVSKEI FN 
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 
TWI^WSDAGSLIACCI^AACmLVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 



3 



2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL
RGDIERKSLAINEEFVSIFKEVKEELESISEDVQAMSNCCQDMT 
SRLQAAKEQTQDLIVKTTKLQSESQKLEIRAQVADAFIiSKFQLT 
SDEMSLLRGTREGPITEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
MEALQDRPVLYKYTLDE FGTARRSTWRGFIDALTRGGPGGTPR 
P I EMHSHDPLR WGDMLAWLHQATASEKEHLEALLKHVTTQGVE 
ENIQEWGHITEGVCRPLKVR IEQ VI VAEPGAVLL YKISNLLKF 
YHHTISGIVGNSATALLTTIEEMHLLSKKI FFNSLSLHASKLMD 
K^LPPPDLGPSSALNQTI^LREVIjASHDSSVVPLD^QMFV 
QVLS CVLDPLLQMCTVSASNLGT^ 

EFTDRRLEMLQFQIEAHLDTLINEQASYVLTRVGLSYIYNTVQQ 
HKPEQGSLANMPNLDSVTLKAAMVQFDRYLSAPDNLLIPQLNFL 
LSATVKEQIVKQSTELVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 



333 


FFYYTFYENKTFSRKMVABKETLSLNKCPDKMPKRTKLLAQQPL 
PVHQPHSLVSEGrrVKAI^KNSVVRGPPAAGAFK^PTKPTAFR 
KFYERGDFPIALEHDSKGNKXAWKVEIEKLDYHHYLPLFFDGLC 
EMTFP YEFFARQGIHDMLEHGGNKILP VLPQLI I PIKNALNLRN 
RQVI CVTLKVLQHL WS AEMVGKALVP YYRQ ILPVLNI FKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 


495 



QQGQRVRTRGRRACASATPLEGCVDLSYPRTHAALIjKVAQMVTL 
L I AF I CVRS S LWTNYS AYS YFEWT I CDLIMI LAF YLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGLVAGA
IFGFMATFLCMASIWLSYKISCVTQSTDAAV 


6332 


1 


878 


.VTESNKFDLVSFIPLLRERIYSNNQYARQFIISWILVLESVPDI 

NLLD YLPE I LDGLFQ I LGDNGKE I RKMCEWLGE FLKE I KKNPS 

o V Kr AbMAW X IjV X HCUTTDDIjIQIjTAMCWMkEFIQLiAGRVM 

SSGILTAVLPCLAYDDRKKSIKEVANVCNQSLMKLVTPEDDELD 

ELRPGQRQAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 

VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 

KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333


3


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG 
QPLTFSPSGRQPLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPG^4MMSHr1SQASMQPAI J PPGVNSMDVAAGTAS 
GAKSMWTEHKSPDGRTYYYNTETKQSTWEKPDDLjKTPAEQLLSK 
CPWKEYKSDSGKPYYYNSQTKESRWAKPKELEDLiEGYQNTIVAG 
SLITKSNLHAMIKAEESSKQEECTTTSTAPVPTTEIPTTMSTMA 
AAEAAAAVVAAAAAAAAJyy^\Z^AnA5TSASNTVSGTVPVVPEP 
E VTS I VATWDNENTVT I STEEQAQLTSTPAI QDQSVEVS SNTG 
EETSKQETVADFTPKKEEEESQPAKKTYTWNTKEEAKQAFKELL 
KEKRVPSNAS WEQAMKMI rNDPRYSALAKLSEKKQAFNAYKVQT 
EKK 


6334 


17


644 


GGN P S GRAAG FAAAAM P S S P LRVAWCSSNQNRS MEAHN I LSKR 
GFSVRSFGTGTHVKLPGPAPDKPNVYDFKTTYDQMYNDLLRKDK 
ELYTQNGILHMLDRNKRIKPRPERFQNCKDLFDLILTCEERVYD 
QVVEDLNSREQETCQPVHVV1WDIQDNHEEATLGAFLICELCQC 
IOHTEDMEKEIDELLOEFEEKSGRTFLHTVCFY 


6335 


82 


529 


AARARPGVLCCRIiLGAALGDQSRVEMSYIPGQPVTAWQRVEIH 
KLRQGENLILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
GPAEIAGLQIGDKIMQWGWDMTMVTHDC3ARKRLTKRSEEVVRL 
LVTROS LOKAVOOSMLS 


6336 


1003 


438 


HE PAS KGRAE VGNMRLS VAAAI SHGRVFRRMGLG PES R I HLLRN 
LLTGLVRHERI EAPWARVDEMRGYAE KLIDYGKLGDTNERAMRM 
ADFWLTEKDLI PKLFQVLAPRYKDQTGGYTRMLQIPNRSLDRAK 
MAVIEYKGNCLPPLPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS
NHSSHTAQTPGI 


6337


76 


524 


EGIQMLSVQPDTKPKGCAGCNRKIKDRYI^KALDIOfWHEDCIiKC 
ACCDCRLGEVGSTLYTKANLI LCRRD YLRLFGVTGNCAACSKIiI 
P AFEMVMRAKDNVYHLDCFACQLCNQRF CVGDKF FLKNNM I LCQ 
TDYEEGLMKEGYAPQVR 


6338




1 1 A Q


1 


a P"NTC 1? CP TfY2 D T .DT P JVKTT .TTtOTT? P A W PTVPTT 1 CM Q & TDP Mfl P VIUD 

GLRLALLLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVNAKNY 

KIWFKKYEVIiALLYHEPPEDDKASQRQFEl^ 

KGVGFGLVDS EKDAAVAKKLGLTEVDS M YV^ KGDE VI E YDGE FS 

ADTIVEFLLDVLEDPVELIEGERELQAFENIEDEIKLIGYFKSK 

DSEHYKAFEDAAEEFHPYI PFFATFDSKGAKKLTLKLNEIDFYE 

AFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLRKLKPESMYETWE 

DDMDGIHI VAFAEEADPDGFEFLETLKAVAQDNTENPDLS I IW I 

DPDDFPLLVPYWEKTFDIDLSAPQIGVVNVTDADRLWMEMDDEE 

DLPSAEELEDWLEDVLEGE INTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTKCPORVIITEDDEDETTVELE{^DEN0EGDFEDADT0EGDTE 
S EPYDDEE FEGYEDKPDTSS S KNKDP I T I VDVPAHLQNSWES YY 
LEILMVTGLIAYIMNYIIGKNKNSRl^QAWFNTHRELLESNFTL 
VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMLIQLRFL 
KRQDLLNVLARMMRP VSDQVQ I KVTMNDEDMDT YVFAVGTRKAL 
VRLOKEMODLSEFCSDKPKSGAI^GLPDSLAILSEMGEVTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLKLPDTKR 
TLLLTFNVPGSGNTYPKDMEALLPLMNMVI YS IDKAKKFRLNRE 
GKQKADKNPJ\RVEENFLKLTHVQRQEAAQSRREEKKRAEKERIM 
NEEDPE KQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340 


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERSFHSSSSS 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGNI KTLGDAYEFAVDVRDFS PED 1 1 VTTSNNHIEVRA 
EICLAADGTV^^^FAHKCQLPEDVDPTSVTSAIJREDGSLTIRARR 

HPHTEH VQQTFRTE I KI 


6341 


2 


645 


KMAVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVAPKPSSRGEYWAKLDDLVNWARRSSLWPMTFGLA 
CCAVEMMHMAAPR YDMDR FGVVFRAS PRQS D VM I VAGTLTNKMA 
PALRKVYDQMPBPRYWSMGS CANGGGYYHYS YS WRGCDR I VP 
VDIYIPGCPPTAEALLYGILQLQRKIKRERRLQIWYRR 


6342 


2


1191 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKI LDLTRVLAG P FATMNLGDLGAEVI KVER PGAGDDTR 
TWGPPFVGTESTYYLSVNRNKKSIAVNIKDPKGVKI IKELAAVC 
DVFVENYVPG KLSAMGLG YED IDE IAPHI I YCS ITG YGQTGP IS 
QRAG YDAVASAVSGLMH I TGPEVACLSHI AANYLIGQKEAKR WG 
TAHGS IVPYQAFKTKDGYI WGAGNNQQPATVCKILDLPELIDN 
SKYKTNHLRVHNRKEIiIKILSERFEEELTSKWLYLFEGSGVPYG 
PINNMK^TVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
MSEARPPPLLGQHTTHILKEVLRYDDRAIGELLSAGVVDQHETH 


6343 


2 


936 


GTAMVSDEDELNLLVI WDANP I WWGKQALKES QFTLS KC I DAV 
MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDP 
GNPPEFNPSGS KDGKYELLTSANEVI VEE I KDLMTKSD I KGQHT 
ETLLAGSLAKALCYIHRMNKEVKDNQEMKSR I LVI KAAEDS ALQ 
YMNFMNVI FAAQKQNIIiIDACVLDSDSGLLQQACDITGGLYIiKV 
PQMPSLLQYLLWVFLPDQDQRSQIjILPPPVHVDYRAACFCHRNL 
IEIGYVCS VCLS I FCNFS P I CTTCETAFKIS LPP VLKAKKKKLK 
VSA 


6344 



2508 



147 



TMPTATLGNLRGYGMASPGLAAPSLTPPQLATPNLQQFFPQATR 
QSLLGPPPVGVPMNPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEA3ELPAKRLRSSEEPTEKEPPGQLQVKAQPQARMT 
VPKQTQTPDLLPEALEAQVLPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQVQPKLQKQAQTQTSPEHLVLQQKQVQPQLQQEAEPQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVQTQTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLLAPEQTPVWHVCGLEMPPDAVEAGGGMEKTLPEP 
. VGTQVSMEEIQNESACX3LDVGBC£3n^ 
LQSSDSI^STVPLTPVPRPSDSVSS^TPAATST^ 
ICKAS CSSQQE FQDHMSEPQHQQRLGE I QHMS QACLLSLLPVPR . 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVprRYFCTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGS ETYS PNTAYGVDFLVPVKGYI CRICHKF YHSNSGAQL 
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345 


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEMEEMIEQLQEKV 

HELEKQNDTLKNRLISAKQQLQTQGYRQTPYNNVQSRINTGRRK 

ANENAGLQECPRKGI KFQDADVAETPHPMFTKYGNSLLEEARGE 

IRNLENVIQSQRGQI EELEHLAE I LKTQLRRKENE IELSLLQLR 

EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 

QRTLKISHDALMAWGDELNMQLKEQRLKCCSLEKQLHSMKFSER 

RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKLKEQ 

QLKVQIAQLETALKSDLTDKTEIIiDRLKTERDQNEKLVQENREL 

QLQYLEQKQQLDELKKRIKLYNQENDINADELSEALLLIKAQKE 

QKNGDLS FLVKVDS E INKDLERS MRELQATHAETVQELEKTRNM 

LIMQHKINKDYQMEVEAVTRKMENLQQDYELKVEQYVHLLDIRA 

ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 

ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 

VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE 

TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 

RVPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPKTAQL 

SSTDSTDGNLNELHITIRCCNHLQSRASHLQPHPYWYKFFDFA 

DHDTAI I PSSNDPQFDDHMYFPVPMNMDLDRYLKSESLS FYVFD 

DSD TQEN I Y I G KVNVPL I S LAHDR C I SG I FE LTDHQ KH P AGT I H 
VILKWKFAYLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 
LVLAPRPKPRQRLTPVDKKVSFVD IMPHQSDVS QEGSVDEVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSEDETEITEDLEPEVEED 
MS ASDSDDC 1 1 PGP I SKNI KQPS EKI RIE I IALS LNDSQVTMDD 
TIQRLFVECRPYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAHVDLADMFQEGRDL I EQNIDVFDARADGEG IGKLRVTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
Q I EKDLLRTMPSNACFASMGS IGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV 
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDI PSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDI ITI VSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYS I AGDDS VTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRS L 
ICVGLNEQVLHLWLEVLCS S LPTVEKWYQPWSFLRS PGWVQIKC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG 


6347 



2921 



533



QDRRLLRLELQKTCQ PTSTMSGSHTPACGPFSALTPS I WPQE I L 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
Q I EKDLLRTMP SNACFASMG S IGVPRLRRVLRALAWLYPEIGYC 
QGTGM VAACLLLFLEEEDAFWMMSAI I EDLLPAS YFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHIAYL
I ADQGQLLGAGTLTNLSQVVRRRTQRRKSTI TALLFGEDDLEAL 
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFP AKFVEVLDERS KEYS I AGDDS VTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS PGWVQIKC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 



3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDL I KSMLRNELQFKEE 
KLAEQLKQAEELRQ YKVLVHS QERELTQLRE KLREGRDAS RS LN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLAWQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 

YSTLSIPPEMLASYKSYSSTFKSLEEQQVCMAVDIGRHRWDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQP YRSAFYVLEQQRVGLAVNMDE I EKYQE VEEDQDPS CPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE

QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 

DQDPSCPRLSRELLDEKEPEVLQPSLGRCYSTPSGYLELPDLGQ

PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL

LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH

VGFS LDVGE I EKKGKGKKRRGRRS KKERRRGRKEGEEDQNP PCP 

RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ 


6349


3 


3679 



AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEWSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHI SFALYVDNRFFTLTVTSLHLVFQMGVIFPQ 


6350 


3 



3679 



AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRIAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAE KVQKS SAPREMPKAEEKE VP EDS LE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQP YRSAFYVLEQQRVGLAVNMDE IEKYQEVEEDQDPS CPRLSR 
ELLDEKE PE VLQDSLGRCYS TPSG YLELPDLGQP YS SAVYSLEE 
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PE VLQDS LDRCYSTPSGCLELTDS CQPYRSAF YILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEEEHI S FALYVDNRFFTLTVTS LHLVFQMGVI FPQ 


6351 


1291 



319 


REARRRTERSQLGRMLVVEVANGRS LVWGAEAVQALRERLGVGG 
RTVGALPRGPRQNSRLGLPLLLMPEET^RLLAE IGAVTLVS APRP 
DSRHHSIALTSFKRQQEESFQEQSAIiAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
GPSSSQAGPSNGVAPLPRSALLVQIiATARPRPVKARPLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYI AQCWAPEDT I PLQDLVAAGRLGTS VRKTLLLCS PQ 
PDGKWYTSLQWASLQ 


6352 


235 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARSLVHDTVFYCLS VYQVKIS PTPQLGAAS S AEGHVGQGAPG 
LMGNMNP EGG VNHENGMNRDG3K I PEGGGGNQE PRQQPQP PPEE 
PAQAAMEGPQPENMQPRTRRTKPTLLQVEELESVFRHTQYPDVP 
TRRELAENLGVTEDKVRVWFKNKRARCRRHQRELMLANELRADP 
DDCVYIWD


6353


65


672 


RFAGAGAIPEARARPPDVQAAEEBKEMDLPDSASRVFCGRILSM 
VNTDDVNAI ILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDS I FRRIRTLKGKLARQHPEAFSHI PEAS 
FLEEEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQ PG S P AINGRSQTDDE EMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFC 
HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 
SVQPLSLENIALRGRCQEAWVLSGKQQ IAKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 


6355 


158 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSLD j 
WDGKVSE IKKKI KS I LPGRS CDLLQDTSHLPPKHSDWI VGGGV 
LGLSVAYWLKKLESRRGAIRVLWERDHTYSQASTGLSVGGICQ 
QFSLPENIQLSLFSAS FLRN INEYIjAVVPAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS 

WSAQIAAtiAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVWPHLALRVPAFETLKVQSAWAGYYDYNTFDQNGWGPH 
PLWNMYFATGFSGHGLCX3APGIGRAVAEMVLKGRFQTIDLSPF 
LFTRFYLGEKIQENNII 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSSVTVSTIDEEEEE I EAREV 
ADS YAQNAKVI EKQLERKGMS KRRLQELAEIjEAKKAKMKGTLID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVLRNQTS ISQWVPVCSRLI PVS PTQGQGDRALS 
RTSQWPQMSQSQACGGSEQIPG I D IQLNRKYHTTRKLSTTKDS P 
QPVEEKVGAFTKIIEAMGFTGPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKQEGRSGKYM 
CRI IVHFMWEDVQQRGRVMGVNPYILKKNMILMTNHFYAAILGY 
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
AISKTAVAPIERVKLLliQVQHASKQIAADKQYKGIVDCIVRIPK 
EQGVLSFWRGNLANVIRYFPTQALNFAFKDKYKQI FLGGVDKHT 
QFWRYFAGNLAS GGAAGATSLCFVYP LDFARTRLAADVGKS GTE 
REFRGLGDCLVKITKSDGIRGLYQGFSVS VQGI I 1 YRAAYFGVY 
DTAKGMLP D PKNTH I WS WMIAQTVTAVAGVVS Y PFDTVRRRMM 
MQS GRKGAD IMYTGTVDCWRKI FRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086 


VCRQEEEKMKEDCrJPSSHVPISDSKSIQKSELLGIiLKTYNCYHE 
GKS FQLRHREEEGTLI IEGLLNIAWGLRRP IRLQMQDDREQVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDAS CMSQRRPKCRAPGEAQR IRRHRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADIiGVEVPHEVAQYIKFEMPVLDSFVEKLKEEEEREI IKLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 



345 


GTRGAVPSTLEEWLPPRSCRVFWIHSGTTMSKVSFKITLTSDP 
RLPYKVLSVPESTPFTAVLKFAAEE FKVPAATSAI I TNDG IGIN 
PAQTAGNVFLKHGSELRI I PRDRVGSC 


6361 


615 



158 


RPGLGQLQHCALAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ 
FKDTLOTPLPDSSPVAVPLGAPIAVASTLSVEHNDGVETGIWAC 
J^RWRRQITSQEFCHFIQGRCTFTPDDGETLH^ 
NSTGIJTOIQETVRKTYVLIL 


6362 

1 


350 



1575 



TTMDGSHSAALKLQQLPPTSSSSAVSEASFSYKENLIGALIAIF 
GHLWS I ALNLQKYCH I RLAGSKDPRAYFKTKTWWLGLFLMLLG 

LRR YVLS FVG CG LAWGTY LL VT FAPN S HE KMTGENVTRH L VS W. 
PFLL YMLVE 1 1 LFCLLLYFYKEKNANNI WI LLLVALLGSMT W 
TVKAVAGMLVLSIQGNLQLDYPIFYVMFVCMVATAVYQAAFLSQ. 
ASQMYDSSLIASVGYILSTTIAITAGAI FYLDFIGEDVLHICMF 
AIX3CLIAFLGVFLITRNRKKPIPFBPYISMDAMPGMQNMHDKGM 
TVQPELKAS FS YGALENNDNI S EI YAPATLPVMQEEHGS RSASG 
VPYRVLEHTKKE 


6363 


21 


1201 


RRTRLGS S FPRRRDS SAME S YDV IANQ P W I DNGSG VI KAG FAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
R YPMEHG I VKDWNDMER I WQYVYS KDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE
RLYSTWIGGS I LAS LDTFKKMWVS KKEYEEDGARS IHRKTF 


6364 


21 


1201 


RRTRLGS S FPRRRDSSAME S YDVI ANQP WI DNGSGVI KAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGD I F IGPKAEEHRGLLS I 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPR KNRERAAEVFFET FNVPALFI SMQAVLS LYATGRTTG WLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFE IVKAI KERACYLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RL YST W IGGS I LAS LDT FKKMWVS KKE YEEDGARS I HRKT F 


6365 


234 


1989 


KHKS RAS CAARAQAFGPS RERE VHS RFRSGLRRLGESNS GCCTM 

ASMGTLAFDEYGRPFLI IKDQDRKSRLMGLEALKSHIM7VAKAVA 

OTMRTSLGPNGLDKMMVDKDGDVTVTNIX3A 

LMVELS KSQDDE IGDGTTGVWIiAGAIiLEEAEQLLDRGI HP I RI 

7V T\r> V007V7\T3X7TV T CUT HVT cnQVT.^rTvT VTVTTrDT.TO , T , BVr r PT JZGIT\7 
AlJt.? I K^AAKVAiarliiiJJVljlJa V J-lVL/J. INJJlILf JLl±^ Xi-VTv. 1 J. LAJOIS. V 

VNS CHRQMAE I AVNAVLTVADMERRDVDFEL I KVEGKVGGRLED 
TKL I KGV1 VDKD FSH PQM PKKVEDAKI AILTC PFE PPKPKTKHK 
LDVTSVEDYKALQKYEKEKFEEMIQQIKETGANIAICQWGFDDE 
AIWLLI^NNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQE IS FGTTKDKMLVIEQCKNSRAVTI FI RGGNKM I IEE 
AKRSLHDALCVIRNLIRDNRVVYGGGAAEISCALAVSQEADKCP 
TLEQYAMRAFADALEVI PMALSENSGMNPIQTMTEVRARQVKEM 
NP ALG I DCLH KGTNDMKQQHVI ETL I GKKQQ I S LATQMVRM ILK 
IDDIRKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFWVLLSIFLGAVAMLCKEQGITVIjGIjNAVFDILV 
I G KFNVL E I VQ KVLHKD KS LENLGMLRNGGLLFRMT LLT S GGAG 
ML YVRWR I MG TGP PAFT EVDNPAS FAD S MLVRAVNYNYY Y S LNA 

QALCSEIX3HKRHILTIX3LGFLVIPFLPASNLFFRVGFVVAERVL 
YLP S VG YCVLLTFGFGAIiS KHTKKKKL I AAWLG I LF I NTLRC V 
LRSGEWRSEEQLFRSALSVCPl^AKVHTOIGKNLADKGNQTAAI 
RYYREAVRl^PKYVHAMimi^NILKERNSLQEAEELIiSLAVQIQ 
PDFAAAWMl^GIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRL YADLNRHVDALNAWRNATVLKPEHS LAWNNM 1 1 LLDNTGNL 
AQAEAVGREALELI POTHSMFSLANVLGKSQKYKESEALFLKA 
IKANPNAASYHGNl^VLYHRWGHLDLAJCKHYEISLQLDPTASGT 
KENYGLLRRKLELMQKKAV . 


6367


287


1934 



SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWALLDPS 
QKNLYRDVMQETFKl^TSVGKTW^ - 
KLCE S KE SHHCGE S FNQIADDMLNRKTL PG I TP CES S VCGEVGT 

«TirtpT MipttT D MTPP1T7C OT3V01T VYTmffDVPXTtf I?/" 1 WAVCVT nClPO 


< 




i 


. GHSS^THI KADi GH^^Jtsj^ajUj^ ^ 

shokactkekpydgTce 

cg kafy f lnlcl i heri htgvkp ykckqcgkaftrs ttlpvher 

TUTY^TTM a FIT? HW PGM A FQ F P QF. T RRHKRS TTTGF.KP YE CKOCGKV 
FISFSSIQYHKMTHTGEKPYECKQCX3KAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCA 
SQl^IHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKQCG KAFRYFS SLH I HERTHTGDKP YE CKVOGKAFTCS SS I R 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAF I RAS S CREHERTHT INR 


6368 


1 


327 


RP VPAKLNPRSWPRTAGALPLRP PPLTMAVFHDEVE I EDFQYDE 
DSETYFYPCPCGDNFS ITKEDLENGEDVATCPSCSLI IKVIYDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRF PTPRGPGSLCHNFCRSAACT VTRTIHGS PREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 
FKNLTSVGKTWKVQNIEDEYKNPRRl^SLmEKLCESKESHHCG 
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 
TGHKSSEYQEYGENPYPJTKECKKAFSYLDSFQSHDKACTKEKPY 
IX3KECTBTFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 
IHERIHTGVKP YKCKQCGKAFTRS TTLPVHERTHTGVNADECKE 
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM 
THTGEKPYECKQCGKAFRCGSHLQKHGRTHTGEKPYECRQCGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
S SLH IHERTHTGD KP YECKVCG KAFTCS SSI R YHERTHTGEKP Y 

BCKHCX3KAFISNYIRYHERTHTGEKPYQCKQCGKAFIRASSCRE 
HERTHTINR 


6370 


1711 



329 


FVLSEQRLRTERTWPRSPGLGRGAAAAGARTAGAGLIjRLLIjGCG 

AJJ V VJ\3Lli\r V XCJX XXr /vLNA^iV/^orvl V¥ AUOUI EiJjJlXX Iryjl/iXrlLflj ±15 

IAVSPRSMSELMCPICLDMLKNTMTTKECLHRFCSDCIVTALR 

SGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQD 
RVL I RLS RLHNOOAT» T EEGLRMOAMKP AOP VRP P T V>CZ C IDT 

TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDHLGECKSFKEKFMKC

LHNNNFENALCRKESKEYLECRMERKLMLQEPLEECLGFGDLTSG 
KSEAKK 


6372 



2141 


625 



RVSAIASEGKAEERYKIGjEDLLEKSFSLVKMPSLQPVVMCVMKH 
LPKVPEKKLKLVMADKELYRACAVEVRRQIWQDNQALFGDEVSP 

T iT.TTnY T T .1? Y P Q & T .P Q TP T . Q\7T .WNT 1? 1? Q D Q P IfTD D rif2PX n//"VD T .TO M 

VGKlWKLYD^rVI,QFLRTLFLRTPJ^YCTLRAELLMSLHDLDVG 
EICTVDPCHKFTWCLDACIRERFVDSKRARELQGFLDGVKKGQE
QVLGDLSMILCDPFAINTIALSTVRHLQELVGQETLPRDSPDLL
liLRLLAl^QGAWDMIDSQVFKEPKMEVELITRFLPMLMSFLVD 
D YT FNVTV5 V T »P &PPTT A P V Q VPMTT ,P P Q PTTT PT .HPOP Ma fi? Vf3T . V 

YVLHITKQPJtfKNAl^RLLPGLV^ 

LADEFAl^DFCSSLFDGFFLTASPRKENVHRHALRLLIHLHPRV 
MSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 



67 



711 



PSRAARASPARLPAMVSWIISRLWLIFGTLYPAYYSYKAVKSK
DIKEYVKWMMYWIIFALFTTAETFTDIFLCWFPFYYELKIAFVA
WLLS P YTKGSSLLYRKFVTIPTIiSSKE KE IDDCLVQAKDRS YDAL 

jmFGraaue^ 

GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASSSGTA 


6374 


535


2105 



HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD
HSCPTHTSCNYTSSTTPLQQTRDHQCPTHTPPNVPPPTTPT.Q^P 

CPAEl^TEGSNGKKEVLSGFQVVLEDTVLFPEGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
SGQHLITAVADHLFKLKTTSWELGRFRSAIELDTPSMTAEQVAA
I EQSVNEKIRDRIjPVNVRELS LDDPEVEQVSGRGLPDDHAGP ir 
VVNIEGVDSIWCCGTHVSNLSDLQVIKILGTEKGKKNRTNLIFL 
SGmVLKWMERSHGTEKALTALLKCGAEDHVFAVKKLQNSTKIL 
QKimLNLLPJ3IAVHIAHSLRNSPDWGGVVILHRKEGDSEFMNI I 
ANEIGS EETLLFL WGDEKGGGLFLLAGPPASVETLGPRVAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKE 


6375 


1 


1535 



AIMAAATRPVRLPEAGCTGRERCWNPSRSRSHSGEGGIiAAWSRT 
CPGRPRRPGQQVVRGPTMLVTAYLAFVGLLASCLGLELSRCRAK
PPGRACSNPS FLRFQLDFYQVYFIxAIiAADWLQAPYLYKIjYQHYY 
FLEGQIAILYVCGLVSTVLFGLVASSLVDWLGRKNSCVLFSLTY
SLCCLTKLSQDYFVLLVGRALGGLSTALLPSAFEAWYIHEHVER 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVAS WIGLGPVAP 
FVAAIPLLAl^GALALRNWGEOTDRQRAFSRTCAGGLRCLLSDR 
RVLlxlXTIQALFESVIFIFVFLWTPVLDPHGAPLGIIFSSFMAA 
SLI^SSLYRIATSKRYHIjQPMHLLSIxAVLIVVFSLFMLTFSTSP 
GQESPVESFIAFLLIELACGLYFPSMSFLRRKVTPBTEQAG\OxN 
WFRVPLHSIACLGLLVLHDSDRKTGTRNMFSICSAVMVMALLAV
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QDLLIAALGMKLGSPKSSVTIWQPLKLFAYSQLTSLVRRATLKE
NEQIPKYEKIHNFKVHTFRGPHWCEYCANFMWGLIAQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKHVKKVYSCDLTTLVKAHTO 
WDMC I RE I ESRGLNS EGL YRVSGFS DL I ED VKMAFDRDGE KAD 
ISVNMYEDINIITGALKLYFRDLPIPLITYDAYPKFIESAKIMD 
PDEQLETLHE^KLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGI VFGPTLMRS PELDAMAALNDIR YQRLWELL I KNED I LF 


6377 



2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE
QRVEDVRLIREQHPTKIFVIIERYKGEKQLPVLDKTKFLVPDHV
NMSELIKIIRRRLQLNANQAFFLLVNGHSMVSVSTPISEVYESE
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAEYHADIYDKVSGDMQKQGCDCECLGGGRISHQSQDKKIHVYG
YSMAYGPAQHAISTEKIKAKYPDYEVTWANDGY


6379 


35 


378 


ERAGSPSPSRAALRRCAPQRSQAPRWPDRAACRRSFQGSQGRAY
LFNSVVNVGCGPAEERVLLTGLHAVADIYCENCKTTLGWKYEHA
FESSQKYKEGKYIIELAHMIKDNGWD 


6380 


1414 


462 


PAVQGQRGAGPPTGRGSGNMARFALTWRHGETRFNKEKIIQGQ
GVDEPLSETGFKQAAAAGIFLNNVKFTHAFSSDLMRTKQTMHGI
LERSKFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CPVFTPPGGETLDQVKMRGIDFFEFLCQLILKEADQKEQFSQGS 
PSNCLETSLAEIFPLGKNHSSKVNSDSGIPGLAASVLVVSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFIINFEEG 
REVKPTVQCICMNLQDHLNGLTENSLGLNLPSKSNHFEPLKGVP
LALFTSLLC


6381 


1668


218 


AVVRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLCKFSPDGK 
YLASCVQYRLVVRDVNTLQILQLYTCLDQIQHIEWSADSLFILC 
AMYKRGLVQWSLEQPEWHCKIDEGSAGLVASCWSPDGRHILNT 
JHtlFHIjRXraWSLCXKSVSYIKYPKA 




RDCKDYVSIFVCSDWQLLRHFDTDTQDLTGIEWAPNGCVLAVWD
TCLEYKILLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQFLAVG
SYDGKVRILNHVTWKMITEFGHPAAINDPKIVVYKEAEKSPQLG
LGCLSFPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP
KIGIGMLAFSPDSYFLATRNDNIPNAVWVWDIQKLRLFAVLEQL
SPVRAFQWDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGEGDFA 
VLSLCWHLSGDSMALLSKDHFCLCFLETEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGNISSQLKHYOTWSMKCHQQQLQRMKENAKHRNQYKFIL 
LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS
AVIGVRVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF 
HNGRYTiRRELLGPVLKKLTEIjKAVLERQES YRFYSS SLLVI YDG 
KERPEWXjDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 
IDFAHTTCRLYGEDTWHEGQDAGYI FGLQSLiIDI vte iseesg 
E 


6383 


3159 


1061 


SPAPGRPSPHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPSEK
GAHPSAARPLAAPTPAAPACRSPSPGGAPASFPGRAPRSLASQP
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLEEVQQVRRSHQDFSRQREELGQGLQGVEQKVQSLQA 
TFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNEI
uisjjjjoiAjj.tt v ViNX/AKEjKUr I biiiiN 1 VhbRLTELTKS INDNI AI F 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSREWDMEALRSTLQTMESDIYTEVRELVSLKQEQQAFKEAAD
TERLALQALTEKLLRSEESVSRLPEEIRRLEEELRQLKSDSHGP 
KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLESLLSKSQEHEQRLAALQGRLEGLGSSEADQDGLASTVRSL
GETQLVLYGDVEELKRSVGELPSTVESLQKVQEQVHTLLSQDQA 
QAARLPPQDFLDRLSSLDNLPCASVSQVEADLKMLRTAVDSLVAY 
SVKIETNENNLESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


738 




± w e* v v v v_ jj i Hij iiUby y an up JjP p p s S bINEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HG TVS A5PQTI^^SLPRS I APKPLTMRLPMNQ I VTSVT I AANMP . 

SNIGAPLISSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ
QQQMQQMMQQLQQHQMHCX3IQQQMQQQHFQHHMQQHLQQQQQH
LQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 

1584


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELSSLGSDSEANGFAERRIDKFGFIVGSQGAEG 
ALEEVPIjEVLRQRESKWLDMLNNTOKWMAKKHKKIRLRCQKGIP 
r o JjKLjKA W\2 i Jjo vavj J\ VKLiQQN P G KFD ELD MS PGDP KWLDV I ERD 

IiHRQFPFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVLLMHMPAEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RWDMFFCEGVKIIFRVGLVLLKHAIjGSPEKVKACQGQYETIER 

ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRIjPLDAPLPGS 

KAKPKPPKQAQKEQRKQMKGRGQLEKPPAPNQAMWAAAGDACP
PQHVPPKDSAPKDSAPQDLAPQVSAHHRSQESLTSQESEDTYL


6386


819


195 


OTCGSFYLGIMQRASRLKRELHMI^TEPPPH3ITCWQDKDQMDDL 

R AOTT/^fZZlWPPYPlffn/PWT CtrrTtlOOVOffDrinnTnoT TnTvirnM 

lr^P^^G^ TC'T.i'nAriiTCLP.PJCGAlJP P.QT.N'Pft'PUT'/PC TQTrTtMff RPWPPD i 

PU^ISSEFKYNKPAFLKNARQWTEKHARQKQKADEEENILDNri 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1


662 



PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR
KOEIiAETTiATOT.P.PnTYA PPRQ YTJTTYTOMVnMTTDr'wriDVT n*Kir\tr 

NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR
KOE IiAETLANIjEP O I YAPP a q yt .PTYroM y^itj t t p n wno vt tmov 

NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDPEIDLKLNKKPRAnY 


6389 


1074 



497 


AEPGDRMAGHRLVXVLGDLHIPHRCNSLPAKFKKLLVPGKIQHI
LCTGNLCTKESYDYLKTLAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVI PWGDMASIiAIrljQRQFPVDILI SGHTHKFEAF 
EHENKFYINPGSATGAYNALETNIIPSFVLMDIQASTWTYVYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ
SGEVTQLLNTMGHHTVGLKLHRKGDRFFPSLGQTWDP


6391 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIVISSLVTTQRKLKA 
MSLLGSRNQXiAJ^AVLNPNPMDFCTKDLLTTTSERIIAYLRDFNE 
DQKKAIETAYAMVKHSPSVAKICLIHGPPGTGKSKTIVGLLYRL
LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKIILEF 
KEKCKDKKNPLGNCGDINLVRLGPEKSINSEVLKFSLDSQVNHR
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENISKVSKERQELASKIKEVQGRPQKTQSIIILESHIICCTLS
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRC2N 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
MISRLPILQLTVQYRMHPDICLFPSNYVYKRNLKTNRQTEAIRC

KRKDVSFRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FQGRQKDCVIVTCVRANSIQGSIGFLASLQRLNVTITRAKYSLF
ILGHLRTLMENQHWNQLIQDAQKRGAIIKTCDKNYRHDAVKILK
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDSKEITLTVTSKDPERPPVHDQLQDPRLLKRMGIEVKGG
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE
GEQEKCGSETHHTRRNSRWDKRTLEQEDSSSKKRKLL 


6392 


972 


186 


GRTGVD1^SS^^RLQIR1^TWDVKDTLLRLRHPLGEAYATKAR 
AHGLEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFHLAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 
ECRTRGLRLAVISNFDRRLEGILGGLGLREHFDPVLTSEAAGWP 
KPDPRIFQEALRLAHMEPWAAHVGDNYLCDYQGPRAVGMHSFL
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGb 


6393 



2017 


730 



TGGS.KMAAVATCGS VAAS TGSAVATAS KSNVTS FQRRGPRASVT 

LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 

ELPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 

IGPVSSSRFGHYYDASKPJ^PQELIEASNWHGFFLPEKISSTLKV 

EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 

LGSPLWGDDICCAENGGNSHSLTKFLYVLRGLLRTSLSACIITM

PTHLIQNKAIIARVTTLSDVWGLESFIGSERETNPLYKDYHGL 

TVSRSSKMDL^SAKRLGPGCG^^GGKKHLDF 


6394


1418


511


VLTl^GDALSQADVl^KMPRNNQLLHFAFP^DKQWKLQQIQDAR 
NHVS OAI YLLTSRDOS YO FKTGAE VT »K"T .Mn AVMT ,OT ,TP A P KTP T .T 

TPATLTLPEIAASGLTRMFAPALPSDLLVNVYINLNKLCLTVYQ
LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC
VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYWSYRPF 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFIQEWALLDSAPJiSLCKYRMLDQCRTLASRGTP 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ
DLLEEASSRDMQMGPGLFLRMQLVPSIEERETPLTREDRPALQE
PPWSLGCTGLKAAMQIQRWIPVPTLGHRNPWVARDSGE


6396 


1 


1221 


ANILSSPSKRGQKGTLIGYSPEGTPLYNFMGDAFQHSSQSIPRF

SDGFHMLFDCSALVMGLFAALMSRWKATRIFSYGYGRIEILSGF 
INGLFLIVUVFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNAl^RGVFLHVLADTLGSIGVIVSTVLIEQFGWFIAD 
PLCSLFIAILIFL^WPLIKDACQVLLLRLPPEYEKELHIAIjEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV
TGILKDAGVNNLTIQVEKFAYFQHMSGLSTGFHDVIAMTKQMES
MKYCKDGTYIM 


6397 


391 


122 


GAGGVGRFELAIPJVPARMIEWCNDRl^KKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY



6398 


353 


1306 


HKQMGPLII^RCKKILLPTTVPPATMRIWLLGGLLPFLLLLSGLQ 
RPTEGSEVAIKIDFDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
DIEAQKNYFRMWQKAHIAWLNQGK^^JPQN^f^TTHAVAILFYT^ 
SNVHSDFTRAMASVARTPQQYERSFHFKYLHYYLTSAIQLLRKD
SIMENGTLCYEVHYRTKDVHFNAYTGATIRPGQFLSTSLLKEEA 
QEFGNQTLFTIFTCLGAPVQYFSLKKEVLIPPYELFKVINMSYH 
PRGDWLQLRSTGNLSTYNCQLLKASSKKCIPDPIAIASLSFLTS
VIIFSKSRV


6399 


75 

1245


RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKKMAECEAEN 
EDLLKKLELYKEACEGQHKLECDLQQREEErAELQKALSDMQVC
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNIiHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMSKIKQYRVQCKKKEDKIGKVLPVMHE
SHHAQSEYIKVMSLCRNEWYF3GRVEGIPKNLQFVM


6400 


2520 


1053 



KTMKCDEWYEVQSAILRHNCGYAMKTGKFFHNLMERICDFETWL 
DNISVTFLSLTDLQKNETLDHLISLSGAVQLRHLSNNLETLLKR
DFLKLLPLELSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW

nTirtfWTfiWnTTM^CUOnaT.UlJTfTniVT.VliTT DMVAT DriUDTV wno I, 

SLIGHSARVYALYYKDGLLCTGSDDLSAKLWDVSTGQCVYGIQT
HTCAAVKFDEQKLVTGSFDNTVACWEWSSGARTQHFRGHTGAVF
SVDYNDELDILVSGSADFTVKVWAI^AGTCLOTLTGHTEWVTKV
VLQKCKVKSLLHSPGDYILLSADKYEIKIWPIGREINCKCLKTL

TPEIANIiALLGFGDIFALLFDNRYLYIMDLRTESLISRWPLPEY 
RKSKRGSSFLAGEASWLNGLDGHNDTGLVFATSMPDHSIHLVLW
KEHG


6401 


109 

766


PGAAWSRPDLRGCCTGPQPALRMLVLPSPCPQPLAFSSVETMEG 
PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTQGVPYTVL

em v ivri. uv^wi V-Vj r\-rvr ivrvrton i irixvn.no xriiJ/^VjljoJxJrrlvjv^r Xil.lrrvr\.r 

RDAGEIAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


1196 

279


TTSQCGGIRQSSAIPVASMEFT^AICLRNAIiLLLPEEQQDPKQEN 
GAKNSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLRKQE 
LENLKCS ILACSAWALALGDNLMALNHADKLLQQPKLSGSLKF 
IiGHLYAAEALrSLDRISDAITHLNPENVTDVSLGISSNEQDQGS 
DKGENEAMESSGKRAPQCYPSSVNSARTVMLFNLGSAYCLRSEY 
DKARKCLHQAASMIHPKEVPPEAI LLAVYLELQNGNTQLALQ 1 1 1 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK


6403 


2 


1690 


RGIHTSVLQGNLQNQMYSHNVVIMNLNNLNLTQVQQRNLITNLQ
RSVDDTSQAIQRIKNDFQNLQQVFLQAKKDTDWLKEKVQSLQTL
A7U!WSAIAKANNDTLEDMNSQLNSFTGQMENITTISQANEQNLK 
DLQDLHKDAENRTAI KFNQLEERFQLFETDIWI ISNISYTAHH 

LRTliTQ^iNFVPTTPTnTT.T?n4TnnT.TQT.M>rPT.7kTJTOT.nC^7GT.'D 1 
jjxv. iui ij lh xjv* Ct v f\. 1 J. L I U 1 iJl i\Xi J. L/JJia X Oi-lLNIN X lxrtJN X-KJjlJo VpliK 1 

MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWKNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH
WIGLTDSERENEWKWLDGTSPDYKNWKAGQPDNWGHGHGPGEDC
AGLIYAGQWNDFQCEDVNNFICEKDRETVLSSAL


6404 


1012 


222


AAALAMAAPAPGLISVFSSSQELGAALAQLVAQRAACCLAGARA
RFALGLSGGSLVSMLARELPAAVAPAGPASLARWTLGFCDERLV 
PFDHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
KKLRQAFQGDSIPVFDLLILGVGPDGHTCSLFPDHPLLQEREKI
VAPISDSPKPPPQRVTLTLPVLNAARTVIFVATGEGKAAVLKRI
LEIXJEENPLPAALVQPHTGKLCTFLDEAAARLLTVPFEKHSPL 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAYATLRNHRPR 
TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMELLQE 
AGVS VP KG YVAKS PDEAYAIAKKLGSKDWI KAQVLAGGRGKGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VLVCERKYPRREYYFAITMERSFQGPVLIGSSHGGVNIEDVAAE 
TPEAIIKEPIDIEEGIKKEQALQLAQKMGFPPNIVESAAENMVK 
LYSLFLKYDATMIEINPMVEDSDGAVLCMDAKINFDSNSAYRQK 
KIFDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVNGAGLAMA
TMDIIKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLAIL 
VNIFGGIMRCDVIAQGIVMAVKDLEIKIPVWRLQGTRVDDAKA
LIADSGLKILACTDLDEAARMWKLSEIVTLAKQAHVDVKFQLP 
I 


6406 



1036 


167 



HPRQMRGEDTPEAPPYSSGRYDSIKTEVSGCPEDLTVGRAPTAD
DDDDDHDDHEDNDKMNDSEGMDPERLKAFNMFVRLFVDENLDRM
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTRPTPPHLTSAMAENILAAACESETRKAAKRMRLEIYQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARALASRPAPSWV 
CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCLAVSQTVLAQLDALLVFPGQVAQLSCTLSPQHVTIRDYGV
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC
VLTISPVQPEDDADYYCSVGYGFSP 


6408 



1458 


903 


RGCITSSQAWRLFX3GVTRGFNMRIEKCYFCSGPIYPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
SFEFEKRRNEPIKYQRELWNKTIDAMKRVEBIKQKRQAKFIMNR
LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE
DVDMEDAP 


6409 


150


GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCYACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC
AISGLFNCITIHPLNIAAGVWMIMNAFILLLCEAPFCCQFIEFA
NTVAEKVDRLRSWQKAVFYCGMAWPIVTSLTLTTLLGNAIAFA
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK
KRASHKPTYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVSAMIEEVFEATDIKITVYTL


6412 


61 


1709 



RPVTSFSPLPGSCGGRLGTRTMLGRSLREVSAALKQGQITPTEL 
CQKCLSLIKKTKFLNAYITVSEEVALKQAEESEKRYKNGQSLGD 
LDGIPIAVKDNFSTSGIETTCASNMLKGYIPPYNATVVQKLLDQ
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN
PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDAAIV
LGALAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLV 
PELSSEVQSLWSKAADLFESEGAKVIEVSLPHTSYSIVCYHVLC 
TSEVASNMARFDGLQYGHRCDIDVSTEAMYAATRREGFNDWRG
RILSGNFFLLKENYENYFVKAQKVRRLIANDFVNAFNSGVDVLL
TPTTLSEAVPYLEFIKEDNRTRSAQDDIFTQAVNMAGLPAVSIP
VALSNQGLPIGLQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDDCSAVLENEKLASVSLKQ \ 


6413 


2 


885 


HEPRCAGMAASLWMGDLEPYMDENFISRAFATMGETVMSVKIIR
NRLTGIPAGYCFVEFADLATAEKCLHKINGKPLPGATPAKRFKL
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG
GKVVLDQTGVSKGYGFVKFTDELEQKRALTECQGAVGLGSKPVR
LSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT
GSYSYSYPQYOYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEBLYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSIGV 
SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSWKDCIHAV 
LKEELANAEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 
VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCVVAAFGC 
FYY 


6415


2 


1168 


FVRQWQS SHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTES S S VSEDGDS SEMDDEDCERRRMECLDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQIRTKVAGIYRELCLESVKNKYECEIQASRQHCESEKLLLYD 
TVQSELEEKIRRLEEDRHSIDITSELWNDELQSRKKRKDPFWPD 
KIOCPGWSGPYIVYMLQDIjDILEDWTTIRKAMATLGPHRVKTEP 
PVKLEKHLHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 


410 



1519 


EIAPADLEIPACAPVLLSRATSSTMSVTGGKMAPSLTQEILSHL
GLASKTAAWGTLGTLRTFLNFSVDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAALSGNLERIVM 
ALLQPTAQFDAQELRTALKASDSAVDVAIEILATRTPPQLQECL
AVYKHNFQVEAVEJGITSETSGILQDLLLALAKGGRDSYSGIIDY 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ
ETEPNYQVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCQSALLALCRAEDM 


6417 


1


845 


RGESRVLWSELEGEAGGAGGWASSLNARMDNRFATAFVIACVLS
LISTIYMAASIGTDFWYEYRSPVQENSSDLNKSIWDEFISDEAD
„,EKT,YNI^ EPERTBSEDWT^ 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL
GLMCFGALIGLCACICRSLYPTIATGILHLLAGLCTLGSVSCYV
AGIELLHQKLELPDNVSGEFGWSFCLACVSAPLQFMASALFIWA 
AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
TPAPPPPPPCGGIACHGEPAKFYGYDNLQRQPIFTTQQEAELVQ
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLELEKEFLFNPYLTRKRRIEVSHALALTERQVKIWFQ
NRRM KWKKEJn NKD KFP VS RQE VKE3GET KKEAQEL E EDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRSIQIPANRSKTAMSKCPIFP 
MARSISTSGPLDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 
PSPDPVTVPYLSPLWWKELESLLENEGDHAITVADFVDHHPIV 
FWNLVWYFRRLDLPSNLPGLILSSEHCNKYSKIPRHCMSEDSKY 
VLIQMLWDNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNSF
LFLSLVALGRENIDIDAFDKEYKMAYDRLTPSQVKSTHNCDRPP
STGVMECRKTFGEPYL 


6420 


207 


1187 



RKMIDKiMQTCGVGQDSVPYMICLIHILEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALIIFYH
GAIPIDFYYFMAKIFIHKGRTCRWADHFVFKIPGFSLLLDVFC
ALHGPREKCVEILRSGHLLAISPGGVREALISDETYNIVWGHRR
GFAQVAIDAKVPIIPMFTQNIREGFRSLGGTRLFRWL.YEKFRYP 
FAPMYGGFPVKLRTYLGDPIPYDPQITAEELAEKTKNAVQALID
KHQRIPGNIMSALLERFH


6421 


1844


362 


WALSLRRQPERMSNKLLSPHPHSVVLRSEFKMASSPAVLRASRL
YQWSLKSSAQFLGSPQLRQVGQIIRVPARMAATLILEPAGRCCW
DEPVRIAVRGLAPEQPVTLRASLRDEKGALFQAHARYRADTLGE
LDLERAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPLA
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMAS FIiKG ITAAWINGS VANVGGTLR YKGETL P PVGVNR 
NRIKVTKDGYADIVDVLNSPLEGPDQKSFIPVERAESTFLFLVG 
QDDHNWKSEFYANBACKRLQAHGRRKPQI ICYPETGHYIEPPYF 
PLCRASLHALVGSPIIWGGEPRAHAMAQVDAWKQLQTFFHKHLG
GREGTIPSKV 


6422 


181 


2133 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFLRYNFDVTKGKIFIE
WMKGATTNICYNVLDRNVHEKKLGDKVAFYWEGNEPGETTQITY
HQLLVQVCQFSNVLRKQGIHKGDRVAIYMPMIPELWAMLACAR
IGALHSIVFAGFSSESLCERILDSSCSLLITTDAFYRGEKLVNL 
KEIADEALQKCQEKGFPVRCCIVVKHLGRAEIiGMGDSTSQSPPI 
KRSCPDVQI 3WNQGIDLWWHELMQEAGDECEPEWCDAEDPLF IL 
YTSGSTGKPKGVVHTVGGYT^YVATTFKYVFDFHAEDVFWCTAD 
iGWITGHSYVTYGPLANTGATSVLFEGIPTYPDVNRLWSIVDKYK 
VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
PPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETTYFKKFPGYYWGIX3CQRDQDGYYWITGRIDDMLNVSGHIj 
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
F3PKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6423 


614 


1237 


ANLrKE I PRDLPPETVLL YLDSNQ I TS I PNEI FKDLHQLR VLNLS 
KNGI EFIDEHAFKGVAETLQTIJDLSpNRIQSVHKNAFNNLKARA 
RIANNPWHCDCTIiQQVLRSMASNHETAHNVICKTSVLDEHAGRP 

QEDARRi5,EYX,KSLPSRQKJCADEPDDI STW 


6424 


1 


1188 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWVVAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 
LKGSKGKDWEIPVPVGISVTDENGKIIGELNKENDRILVAQGGL 
GGKLliTNFIiPLKGQKRI IHLDLKLIADVGLVGFPNAGKSSLLSC 
VSHAKPAIADYAFTTLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGHKFLKH I ERTRQLLFWD I SGFQLSSHTQYRTAFETI I 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNMIPERTVEFQHIIPISAVTGEGIEELKNCIRKSL
DEQANQENDAIiHKKQLLNLW I SDTMS S TEPPS KHAVTTSKMD 1 1 


6425 


1850 


1144 


LAMEGGGGIPLETLKEESQSRHVLPASFEVNSLQKSNWGFLLTG 
LVGGTLVAVYAVATPFVTPALRKVCLPFVPATMKQ I ENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKWSFTAVGYELNPWLVWYSRYRA 
WREGVHGSAKFYISDLWKVTFSQYSNWIFGVPQMMLQLEKKLE 
RELEDDARVIACRFPFPHwTPDHvTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFSFSPEPTLEDIRR 
LHAEFAAERDWEQFHQPRNLLLALVGEVGELAELFQWKTDGEPG 
PQGWSPRERAALQEELSDVLIYLVALAARCRVDLPLAVLSKMDI
NRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADIPCDSTGQT
ST 


6427 


145 


959 


AAS WG P PHVPKAGKMVSWM I CRL WLVFGMLjCPAYAS YKAVKTK 
NIREYVRWMMYWIVFALFMAAEIVTDIFISWFPFYYEIKMAFVL
WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV
LSFGKRGLNIAASAAVQAATKSQGALAGRLRSFSMQDLRSISDA
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VPRAPARPREKPLIRSQSLRWKRKPPVREGTSRSLKVRTRKKT 
VPSDVDS 


6428


1982 


444 


SGSGGKMEDHQHVP ID I QT5KLLDWLVDRRHCS LKWQSLVLTIR 
EKINAAIQDMPESEEXAQIiIiSGSYIHYFHCLRII»DLLKGTEAST 
KNIFGRYSSQRMKDWQEIIALYEKDNTYLVELSSLLVRNVNYEI
PSLKKQIAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELLALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF

VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEL
PEQVAEDAIDWGDFGVEAVSEGTDSGISAEAAGIDWGIFPESDS
KDPGGDGIDWGDDAVALQITVLEAGTQAPEGVARGPDALTLLEY
TETRNQFLDELMELEIFLAQRAVELSEEADVLSVSQFQLAPAIL

qgqtkekmvtmvsvledligkltsiiqlqhlfmlljispryvdrvt 

EFLQQKLKQSQLLALKKELMVQKQQEALEEQAAIIEPKLDLLLEK
TKELQKLIEADISKRYSGRPVNLMGTSL


6429 



3413 


3442 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTL 
WKSVTDKIDAGDYLCTARNKVGCDYVVLKVDVVMKPAKI EHKEE 
NDHKVFYGGDLKVDCVATGLPNPEISWSLPDGSLVNSFMQSDDS
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 
VKWTAPAT I RNKTCLAVQVP YGDWTVACEAKGE PMPKVTWLS 
PTNKVIPTSSBKYQIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE * 
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
IPTPRVLWAFPEGWLPAPYYGNRITVHGNGSLDIRSLRKSDSV 
QLVCMARNEGdEARLIVQLTVLEPMEKPIFHDPISEKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 
XSGLSSVDAGAYRCVARNAAGHTERLVSLKVGLKPEANKQYHNL
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 
LDNGTLTVREASVFDRGTYVCRMETEYGPSVTSIPVIVIAYPPR
ITSEPTPVIYTRPGNTVKLNCMAMGIPKADITWELPDKSHLKAG
VQARLYGNRFLKPQGSLTIQHATQRDAGFYKCMAKNILGSDSKT
TYIHVF


6430


1946 


602 


RTRVS TGLRJRTLLWS EAVGAS STRGDTG I PGSGEGGAGPGGGEG - 

AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTIEDFNKFCSFVLA 

YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWDSAPSDLRTI

QTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 

KLKDSLFDLDGPKVASPLSPTSLTHTSRPPAALTPVPLSQGDLS 

HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 

KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEEEEEEEEEEEMA 

TWGGEAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 

TSQDGDASSSEGEMRVMDEDIMVESGDDSWDLITCYCRKPFAGR 

PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG

GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR 
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPEPV 
IEEVDLANLAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER
LKGQEDSLASAVDAATEQKTCDSD


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
DYSDQEVLQTLTKFCFPFYVDSLTVSQVGQNFTFVLTDIDSKQR
FGFCRLSSGAKSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ
WKELLETLHKLPIPDPGVSVHLSVHSYFTVPDTRELPSIPENRN
LTEYFVAVDVNNMLHLYASMLYERRILIICSKLSTLTACIHGSA
AMLYPMYWQHVYIPVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN
MALDDWILNVDTNTLETPFDDLQSLPNDVIS9LKNRLKKVSTT 
TGDGVARAFliKAQAAF FGS YRNALK I EPEB P I T FCEEAF VSH YR 
SGA^QFLQNATQLQLFKQFIDGRLDLLNSGEGFSDVFEEEINM 
GEYAGSDKLYHQWLSTVRKGSGAILNTVKTKANPAMICrVYKPDI 
AENGCAPTPEEQLPKTAPSPLVEAKDPKLREDRRPITVHFGQVR
P PP PWUW P P IT QTJT A VFHP P T 3 VP Q P RflNT T AT P ATT .W T T .r>K"Q T 

THFAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 
LPLLLLLVATTGPVGALTDEEKRLMVEIJKrLYRAQVSPTASDML 
HMRWDEELAAFAKAYARQCVWGHNKERGRRGENLFAITDEGMDV
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQVVWAKTERIGCG 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG
AEGPDKPSWSGl^SGPGHWGPLl^LLLLPPLVIiAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE
SPELRQKSPLFQFAEISSSTSHSDASTKQCQTSALFQFAEISSN
TSQLGGAEPVKRCGKSALFQLAEMCLASEGMKMEESKLIKAKES 
DGGRIKELEKGKEEKEIKMEKTDETRLQKEAEFEKSAKENLRDS 
KELRNFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY

KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIEAVAKGDWG 
IEKMDTPRKKVRTSSSGKGSILl^KPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFISISASKNISGETPEGIKAEPLTPMEDALPPS 
LSGQAKPEDSDCHRKIETCGSRKSERSCKGALYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 
TKPKEDCLLGSAKLDEEFEKKFNSLPQYSPVTFDRKCVPVPRKK
KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 
MEPVHKVKNIPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCQ
DTQPJ^GYHHEEVLWMTNLMIWCGGVYLKQLRHTAMTNA 


6435 


2227 


657 


ALQRDAAAAYAHPEYEERFLQEETVSQQINSIELLQTRPIALPE 
VVKSQRPLQRQVHLRGRPASQPTVIRGITYYECAKVSEEENDIEE 
y \juct r r buUNtj V U Jjjj 1 £, LKJ JjLilcrilN taioW J. £j V 1 KKirAA 1 Kytjriis X A 
VTSDLNARTAPWSSAiPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTRESVIiQPSPQVPATTVAHTATQOPMPAPPAVSP 
HTVPVPPTT VRTDS LGKD APAGRGTT PAS PTLSPEEEDDIRNVI 
GRCKOTLSTITGPTTQNTYGRl^GAWMKDPLAKDERIYVTNYYY 
GNTLVEFRNLENFKQGRWSNS YKLPYSW IGTGHWYNGAFYYNR 
AFTRN 1 1 K YDL KQR Y VAAWAM LHD VA Y E EAT P WRVJ QGHS D VD FA 
VDENGLWLIYPALDDEGFSQEVIVLSKlxNAAD^STQKETTWRTG 
LRRNFYGNCFVTCGVT.YAVDS YNORNAN I S YAFDTHTNTOI VPR 

UlV4\lv X X Wll V*»X V .X, \*\J V JJ X V JJlJ X X* ^iuvriii J* 1/14 *J X 4^ X 4>l X V XT XV 

LLFENEYF YTTQI DYWPKDRLLYAWDNGHQVTYHVI FAY 


6436 


1295 


341 


GACRPPVRQDPDSGPDYEALPAGATVTTHMVAGAVAG I LEHCVM 
YPIDCVKTRMQSLQPDPAARYRNVLEALWRI IRTEGLWRPMRGL 
NVTATGAGPAHALYFAC YEKLKKTLSDVI HPGGNSH IANGAAGC 
VATLLHD7VAMNPAE WKQRMQMYNS P YHRVTDCVRAVWQNEGAG 
AFYRSYTTQLTMNVPFQAIHFMTYEFLQEHFNPQRRYNPSSHVL 
S GACAGAVAAAATT P LDVCKTLLNTQESIALNS HITGH I TGMAS 
AFRTVYQVGGVTAYFRGVQARVI YQ I PSTAIAWSVYEFFKYLIT 
KRQEEWRAGK 


6437 


1828 


360 


PPAPAPPASPARHVTRTARGHLEGGSRAPPLLQAVFljQIKNMVK 
LIHTLADHGDDWCCAFSFSlXATCSLDICriRLYSIiRDFTELPH 
SPLKFHTYAVHCCCFSPSGHI1^SCSTIX3TTVLWNTENGQMIAV 
MEQPSGSPVRVCQFSPDSTCLA^GAADGTVVLWNAQSYKLYRCG 
SVKDGSLAACAFS PNGS FFVTGSS CGDLTVWDDKMRCLHS EKAH 
DLGITCCDFSSQPVSDGEQGLQFFRLASCGQDCQVKIWIVSFTH 
Il^FELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD
TNTENILHTLTQHTRYVTTCAFAPNTUjLATGSMDKTVNIWQFD 
IiETLCQARS TEHQLKQFTEDWS EEDVSTWLCAQDLKDLVG I FKM 
NNI DG KELLNLTKES LADDLKI ESLGLRSKVLRKIEELRTKVKS 
LSSGIPDEPICPITRELMKDPVIASDGYSYEKEAMENWDPAKRN 
RTSPP 


6438 


109 


901 


EVQILRAKMFQTGGLIVFYGLLAQTMAQFGGLPVPLDQTLPIiNV 
NPALPLSPTGLAGSLTNALSNGLLSGGLLGILENLPLLDILKPG 
GGTSGGLLGGI»LGKVTSVI PGLNNI IDIKVTJDPQLLELGLVQSP 
DGHRL YVTI PLG I KLQ VNTPLVGAS LLRLAVKLD I TAE I LAVRD 
KQERIHLVLGDCTHS PGS LQISLLDGLGPLP I QGLLDSLTG ILN 
KVLPELVQGNVCPLVNEVLRGLDITLVHDIVNMLIHGLQFVIKV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 
KQAKEEAQMEVEQYRREREHEFQSKQQAAMGSQGNLSAEVEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


RARWNSDMGDLPGLVRLS IALRIQPNDGPVFYKVDGQRFGQNRT 
IKLLTGSSYKVEVKIKPSTLQVENISIGGVLVPLELKSKEPDGD 
RWYTGT YDTEGVTPTKSGERQP I Q I TMPFTD IGTFETVWQVKF 
YNraKRDHCQWGSPFSVIEYECKPNETRSLMWVNKESFL 


6441 


234


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 
RNVCKENSTVGMKIQEELQRSGGLDHIiVLSPGEWPVSDNTIMHI 
ATAEALTTDYWCLDDLYREMVRCYVE I VEKLPERRPDPATIEGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 
MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 
DSENKAI FPDNYDAEEREKTYRKWS SEGRGGRRGHDAPM IAYDA 
LLAAGNSWTELCHRAMFHGGE SAATGTI AGCLFGUjYGLDLVPK 
GLYQDLEDKEKLEDIjGAALYRLSTEEK 


6442 


>• • * - '-. 


796 


AEDPAGGIiAGQDTMFARGLKRKCVGHEEDVEGALAGLKTVSSYS 
LQRQSLLDMSLVKLQIXHMLVEPNLCRSVLIANTVRQIQEEMTQ 
IXtTWRTVAPQAAERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 

ghtqgpvsdlcpvtsaqaprhlqssawemdgprenrgsfhksld 
qifetletknpscmeelfsdvdspyydldtvltgmmggarpgpc 
egleglapatpgpssscksdlgsldhweilvet 


6443 


2 


555 


maspaassvrpprpkkepqtlvi pknaaeeqklklerlmknpdk 
avpipekmsewaprpppefvrdvmgssagagsgefhvyrhlrrr 
eyqrqdymdamaekqkldae fqkrleknkiaaeeqtakrrkkrq 
klkekkliakkmkleqkkqegpgqpkeqgssssaeasgteeeee 
vpsftmgr 


6444 


390 


899 


gstprgkmrapipepkpgdlieifrpfyrhwaiyvgdgywhla 
PPS EYAGAGAASVMSAIjTDKAI vkkellydvagsdkyqvnnkhd 

DKYSPLPCSKIIQRAEELVGQEVI.YKLTSENCEHFVNELRYGVA 
RSDQVRDVI IAAS VAGMGLAAMSIilGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARSPRPQAHTKGVRGLPSRRRSPDCGRMELAAGSF 
SEEQFWEACAELQQPALAGADWQLLVETSGISIYRLLDKKTGLY 
EYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
TVVYWEVKYPFPMSNRDYVYLRQRRDLDMEGRKIHVILARSTSM 

SWL INWAAKNGVPNFLKDMARACQNYLKKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSLASGATGGRGAVENEEDLPELS 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 
EHQFNIDSMVHKHGLEFYGYIKLINFIRLKNPTVEYMNSIYNPV 
PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGLSENTS 
VVBKLKHMEARATjSAEAALARAREDLQKMKQFAQDFVMHTDVRT 
CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 
FIYQNPHIFKDKWLDVGCGTGILSMFAAKAGAKKVLGVDQSEI 




WO 01/53312 



PCT/US00/34263 



LYQAMDI IRLNKLEDTITLI KGKIEEVHLPVEKVDVT I SEWMGY 
FLLFESMLDS VLYAKNKYLAKGGS VY PD I CTISLVAVSDVNKHA 
DR IAFWDt)VYGFKMS CMKKAVI PEAWEVLD PKTL I S E P CG I kh 
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPRS LTVTLTLNNS TQTYGLQ 


6447 


1554 


1068 


RLGPAEWHLSGPCHATLGARNRGRAIX5VRAAWRGAPLCQRVMMP 
SRTNLATGI PSSKVKYSRLSS TDDGYIDLQFKKTPPKI PYKAIA 
LATVLFLIGAFLI I IGSLLLSGYI SKGGADRAVPVLI IGILVFL 
PGFYHLRIAYYASKGYRGYS YDDI PDFDD 


6448 


74 


559 


GQVLSHC YHYRSS RWRRGGL S RGRGAG VMALVP YE ETTE FGLQ K 
FHKP LAT FS FANHT I Q I RQDWRHLG VAAVVWDAA I VLS T YLEMG 
AVEIjRGRSAVELX^GTGLVG I VAAIjLACR IRYERDl^FIiAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 


6449 


597 


1876 


E YGVCENLRKLE I TGVS CRDVYAKLLHR YRH I LGLWQ PD I GPYG 
GLLNVWDGLFIIGWMYLPPHDPHVDDPMRFKPLFRIHLMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRTWLREEWGRTLEDI FHEHMQELI LMKFI YTSQYDNCLTYRRI 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITG 
DPNIPAGQQTVEIDLRHRIQLPDLENQRNFNELSRIVLEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAEQPAQCGQGQPFVLPVGVSSRNEDYPRTCRM 
CF YGTGL I AGHGFTS PERTPGVF I LFDEDRFGFVWLELKS FSLY 
SRVQATFRNADAPSPQAFDEMLKNIQSLTS 


6450 


848


269 


FVPAPRTVSGKRSLPGEWEERGEGEQRTGREFSGNGGRAVEAAR 
MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 
VLGIiAGNSFRPEHRALLNAFTATFEbSDDGRFEVWNAMTRGQHC 
DTWSYVLIPAAQPGQFTVDHRVWTHEQAGRPQDQPAGQELVAAS 
RDAGPVHLPGQS SGPLG 


6451 


232 


939 


HS P T P P TS PRAS TMED VKLE FPS L PQ CKEDAEE WT YPMRREMQE 
ILPGLFLGPYSSAMKSKLPVLQKHGITHIICIRQNIEANFIKPN 
FQQLFRYLVLDIADNPVENIIRFFPMTKEFIDGSLQMGGKVLVH 
GNAGISRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPNAGF
VHQLQEYEAIYLAKLTIQMMSPLQIERSLSVHSGTTGSLKRTHE 
EEDDFGTMQVATAQNG 


6452 


1 


652 


RTRGESSNMEPLAAYPLKCSG PRAKVFAVLLS I VLCTVTLFLLQ 
LKFLKP KINS F YAFEVKDAKGRTVS LE K YKGKVSL WNVASDCQ 
LTDRNYLGLKELHKEFGPSHFSVLAFPCNQFGESEPRPSKEVES 
FARKNYGVTFP I FHKI KI LGSEGEPAFRFLVDSSKKE PRWNFWK 
YLVNPEGQWKFWRPEEP I EVI RPD IAALVRQVI I KKKEDL 


6453 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT
PADSSGTAEGGSGVASPAQADKAEL


6454


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT
PADSSGTAEGGSGVASPAQADKAEL


6455 


1042 


173 


RVHLATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLY IE I KRG VTEDDGRP I YALVNLATTS I SKMATDFAENELDLF 
RKALELI IDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKICNICHSL
LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYWPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


RPQSRS ISMWRNSLLQVS SGLRWLRVCAMVDI LGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI P KLi V I VKQNG E VI TN KGR KQ I RE RG LAC FQD WVEAA 
DIFQNFSV 


6457 


23


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6461


1653 


360 


LQQRTLRITAVGQTHPIAWMAWEPSLGAFYGPASFITFVNCMYF
LSIFIQLKiuiPERKYELKEPTEEQQRLAANENGEINHQDSMSLS
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATS LS FSAFFWHHCVNREDVRLAW I MTCCPGRS S 
YSVQVNVQPPNSNGTNGEAPKCPNSSAESSCTNKSASSFKNSSQ 
GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDNDI KMHV 

APLEVQFRTNVHSS RHHKNRS KGHRASRLTVLREYAYDVPTS VE 

ncMnKini ovcdt PMWi?r!UCDCDDavT avD^?DfT^^M^^o^^r^■neeT^7v^ , 
bo VyiNUJjJ^K^KJjVjrlNINnjonbKoKKAiljAl 

STLPKSSRNFEKPVSTTSKKDALRKPAVVELENQQKSYGLNLAI 
QNGP I KSNGQEG PLLGTDSTGNVRTGLWKHETTV 


6462 


-> 


/ / j 


o Ci £» LtUKCt iVKJ-iAJSi/b r*K A i irIY iUi o (jr Vi'oJLir VoijlollUliir'ivtlAfCil 
PDSOS MEES KLKNDDRKTPVNl^KDSRGTRVAVS S PMSOHOS YIO 
YLHAYP YPQMYD P SHPAYRAVS PVLMHS YPGAYLS PGFHYPVYG 
KMSGREETEKVNTSPSVNTKTTTESKALDLLQQHANQYRSKSPA 
PVEKATAEREREAERERDRHSPFGQRHLHTHHHTHVGMGYPLIP
GQYDP FQGLTSAALVASQQVAAQASASGMFPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMBKKKGEPRTRAEARPWVDEDLKDSSD
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


GILRQKEREERNRIHKKEILFLEHLLWPSEMSSLSGKVQTVLG
LVE PSKL GR TLTHEHLAMTFD C C YCPPPP CQEAI S KE P I VMKNL 
YWIQKNAYSHKENLQLNQETEAI KEELLYFKANGGGALVENTTT 
G1SRDTQTLKRIAEETGVHIISGAGPYVDATHSSETRAMSVEQL 
TDVLMNE I LHGADGTS I KCGI IGE I GCSWPLTESERKVLQATAH 
AQAQLGCPVIIHPGRSSRAPPQIIRILQEAGADISKTVMSHLDR 
TILDKKELLEFAQLGCYLEYDLPGTELLHYQLGPDIDMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRLMKYGGHGYSHILTNVV 
PKMLLRGITENVLDKILIENPKQWLTFK 


6465 


126 


1396 


KMTVFFKTLRNPIWKKTTAGLCLLTWGGHWLYGKHCD^LLRRAAC 
QEAQVFGNQLI PPNAQVKKATVFLNPAACKGKARTLFEKNAAP I 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVI IVAGGDGTLQEV 
VTGVLRRTDEATFSKI P IGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAIVKGETVPLDVLQIKGEKEQPVFAMTGLRWGSFRDAGV 
KVSKYWYLEPLKIKAAHFFSTLKEWPQTHQASr SYTGPTERPPN 
E PEETPVQRPSLYRR I LRRLAS YWAQPQDALSQEVSPE WKDVQ 
LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRNPKLHVEGTECLQASQCTLLIPEGAGGSFSIDSEEYEAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
PFCNHEKSCDVKMDRARNTGVISCTVCLEEFQTPITYLSEPVDV 
YSDW IDACEAANQ 


6467 


301 


2571 


GELRVLAIJ^GELACHAVLTASLLSLRSRLMDSDMDYERPNVET 

I KCVWGDNAVGKTRL I CARACNATLTQYQLLATHVPTVWAIDQ 

YRVCQEVJjERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV \ 

WLCFSIANPNSLHHVKTMWYPE I KHFCPRAPVILVGCQLDLRY 

ADLEAVNRARRPrARPIKPNEILPPEKGREVAKELGIPYYETSV 

VAQFGIKDVFDNAIRAALISRRHLQFWKSHLRNVQRPLLQAPFL 

PPKPPPPIIWPDPPSSSEECPAHLLEDPLCADVILVLQERVRI 

FAHKIYLSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 

SDQHHHflHHHHHGRDFJjLRAASFDVCESVDEAGGSGPAGLRAST 

SDGI LRGNGTGYIiPGRGRVLSSWSRAFVS IQEEMAEDPLTYKSR 

LMVVVKMDSSIQPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 

LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECLAKGT 

FSDVTFILDDGTISAHKPLLISSCDWMAAMFGGPFVESSTREW 

FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLI ILANRLCLPHL 

VALTEQYTVTGLMEATQMMVD IDGDVLVFLELAQFHCAYQLADW 

CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 

EDHYQRARKE RE KE DYLHLKRQ PKRRWLF WNS PS S P S S S AAS S S 

SPSSSSAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMVVNVSSLSLNEPED
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL
PFTKSLSLVFHAIDYHYISSQGFPIEGWAVVYYITHLLKGALLF

TniTur Tflaiof.mrTlrtTTT CTMTnWTDMTtrTTlDDtTT.UVnfA VTTTT7C 

ITIAI/lGTGWAF IKJi±L»SDiUJKKIi , Ml v J. rKK VJjAIM VAX 11 lhij t 

TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 

GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMVVNVSSLSLNEPED
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVXGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL

PFTKSLSLVFHAIDYHYISSQGFPIEGWAVVYYITHLLKGALLF 

ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIBS 

TEEGTTEYGLWKDSLFLVDLLCCGAILFPVWSIRHLQEASATD
GKGKFSRAHFVLLSLL 


6470 


2726 


1437


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS 
GPLPREDGCRTPGPQLLPtiPGALIiRPRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTES PS SAAGRPASMAEAEEDCHSDTVRADDDE 
ENES PAETDLQAQLQMFRAQWMFELAPG VS SSNLENRPCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWVVS 
SDLDLRSLEQLSLVCRGFYICARpPEIWRLACLKVWGRSCIiCLV 
POTSWREMFLERPRVRFDGWISCTTYIRQGEQSLDGFYRAWHQ
VEYYRYIRFFPDGHVMMLTTPEBPQSIVPRLRTR 


6471 


1750 


299


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 

GPRNKKRGWRRIAQEPLGLEVDQFLEDVRLQERTSGGLIiSEAPN 

EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK 

VPAPKDVLAHQVPNAKKLRRKEQLWEKLAKQGELPREVRRAQAR 

LLNPSATRAKPGPQDTVERPFYDLWASDNPLDRPIiVGQDEFFLE 

QTKKKGVKRPARLHTKPSQAPAVEVAPAGASYNPSFEDHQTLLS

AAHEVELQRQKEAEKLERQLiALPATEQAATQESTFQELCEGLIiE 

ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 

VHRLRVQQAAIiRAARLRHQELFRIiRG I KAQVALRLAELARRQRR 

RQARREAEADKPRRLGRLKYQAPD I DVQLSSELTDSIJRTLKPEG 

NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3


897 


S CGSDRAQWAME FPFD VDALF PERI TVLDQHLRPPARRPGTTTP 

ARVDLQQQ IMTI IDELGKAS AKAQNLSAP ITS ASRMQSNRHWY 

HiKDSSARPAGKGAIIGFIKVGYKKLFVIjDDREAHNEVEPLCIIi 

DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF

LNKHYNLETTVPQVNNFVIFEGFFAHQHRPPAPSLRATRHSRAA

AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 

RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP 


6473 


22 


9X2 


SSAVEFVWEGBKMAAEPNKTEIQTLFKRLRAVPTNKACFDCGAK
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
JjKC-My VCjbNANATAFFRQHGCTANIiANT RQ 
LGSTUUjARHGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQPPAW 
DAPATEPSGTQQ PAPSTES SGIiAQPEHGPNTDLLGTS PKAS LEL 
KSSIIGKKKPAAAKKGLGAKKGLGAQKVSSQSFSEIERQAQVAE 


6474 


3


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP
PSTLSVKGQIETVRVKGTEN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLKQRIAEETIL 
KSQVDKRFSAHYDAVEAELKSSTVGLVTLNDMKARQEALVRERE 
RQLAKRQHLEEQRLQQERQREQEQRRERKRKISCLSFALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFI KEDLILPHYHTFYDFI IARARGK 
SGPLFSFDVHDDVRIiLSDATMEKDESHAGKWLRSWYEKNKHIF 
PASRWEAYDPEKKWDKYTIR


6477 


227 


915 


LQGHIjMGIMAASRPLSRFWEWGKNIVCVGRNYADHVREMRSAVL 
S EP VLFLKP S TAYAP EGS P ILM PAYTRNLHHELiELG WMG KRCR 
AVTEAAAMDYVGGYALCI^Di^TARDVQDECKKKGLPWTLAKS FTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
I IS YVSKI ITLEEGDI ILTGTP KGVGPVKENDEIEAGIHGliVSM 
TFKVEKPEY 


6478 


2 


1495 


FVSSRILPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
I FMEVLGSGAFS EVFLVKQRLTGKLFALKCI KKS PAFRDS SLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RGVYTEKDAS LVIQQVLSAVKYLHENGI VHRDLKPENLLYLTPE 
ENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPEVLAQKPYSKA 
VDCWS IGVI TYI L3LCGYPPFYEETES KLFEKI KEGYYEFES PFW 
DDISESAKDFICHLLEKDPNERYTCEKALSHPWIDGNTALHRDI 
YPSVSLQIQKNFAKSKWRQATOAAAVVHHMRKLHMNLHSPGVRP 
EVENRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSLAAGPCGCC 
SSCLNIGSKGKSSYCSEPTIiIiKKANKKQNFKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLFPVFDLS 
YF I VS ILYLKYEPGAVELSRRHPIAS WLC7\MLHCFGSYI LADLL 
LGE PLI DYFSNNS S I LLASAVWYL I FFCPIJDLF YKCVCFLPVKL 
I FVAMKE WRVRKI AVG I HHAHHHYHHG WFVM I ATGWVKGSGVA 
LMSNFEQIiLRGVWKPETNE I LHMSFPTKASLYGAI LFTLQQTRW 
LPVSKASL I F I FTLFMVS CKVFLTATHS H S S P FDALEGY I CPVL 
FGS ACGGDHHHDNHGGSHSGGGPGAQHS AMPAKS KEELS EGSRK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFIi 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVI AGGTLAI P ILAFVAS FLLWPSALIRI YYWY 
WRRTLGMQVRYVHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKD
MWLSVVKFLPKNIjHLVCVDMPGHEGTTRS SLDDLS IDGQVKRIH 
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP 
AGLQ YSTDNQFVQRLKELQGSAAVEKI PLI PSTPEEMSEMLQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDKI KVPTQI I WGKQDQVLDVSGADMLAKS IANCQVELLENCG 
HS WMERPRKTAKLI IDFLASVHNTDNNKKLD 


6482 


2517 


568 


EPVSKVSQSRRKAGVPTANIEESQAVEAAMANVPWAEVCEKFQA 

ALALSRVELHKNPBKEPYICSKYSARALLEEVKALIjGPAPEDEDE 

RPEAEDGPGAGDHALGLPAE WEPEGPVAQRAVRLAVI E FHLGV 

NHIDTEELSAGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLGI 

LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPEE 

EKLTEQERSKRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 

KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 

FGQTGKI SATEDTPEAEGEVPEL YHQRKGEI ARCW I KYCLTLMQ 

NAQLSMQDNIGELDLDKQSELRALRKKELDEEESIRKKAVQFGT 

GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 

IDGYVTDHIEWQDHSALFKGLAFFETDMKKK^ 

PLTVDLNPQYYLLVNRQ I QFEI AHAYYDMMDLKVAIADRLRDPD 

SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVIjRPA 

MLAKFRVARLYGKI ITADPKKELENIiATSLEHYKFI VDYCEKHP 

EAAQEIEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAPLSANGREARAMEQRLAEFRAARKRAGLAA(2P 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQPQGSTSETPWNTAIPLPSCWDQSFLTNITFLKVLLW 
LVLLGLFVELEFGLAYFVLSLFYWMYVGTRGPEEKKEGEKSAYS
VFNPGCEAIQGTLTAEQLERELQLRPLAGR 


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGESIGNCPPCQRLF 
MILV^KGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DPI KIEEFLEQTLAPPRYPHLS PKYKESFDVGCNLFAKFSAYI K 
NTQKEANKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCSLLPKLNI I KVAAKKYRDFDI PAEFSGVW 
R YLHNAYAREE FTHTCP E DKE I ENTYANVAKQKS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRSILEEDEE 
DEE P PRVLLYHEPRS FEVGMLWHKHKKYP FWPAVVKS VRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSLITDYRVRLGCGSFAGSFLBYYAADISYPVRKS IQ 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVEYIGKAKGAESHLRAILKSRKPSRWLQTFLSSSQYVT 
CVETYLEDEGQLDLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 

VLLPEAI I CAI SAGDEVDYKTAEEKYI KGPSLS YREKE I FDNQL 
LEERNRRRR


6486 


10 


581 


LVLQAGGAHLSPSRVTQGIYYMLAFSEMPKPPDYSELSDSLTLA 
GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLLASAAAFYYVFE 
ISETYNRLALEHIQQHPEEPLEGTTWTHSLKAQLLSLPFWVWTV 
I FLVP YLQMFLFLYS CTRADPKTVGYC 1 1 P I CIjAVI CNRHQAFV 
KASNQISRLQLIDT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGCI FHRNI KGFMVQTGDPTGTGRGGNS I WGKKFEDEYSEYL 
KHNVRGWSMANNGPNTNGSQFF I TYGKQPHLDMKYTVFGKVI D 
GLETLDELEKLPVNEKTYRPLNDVHI KDITIHANPFAQ 


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPLPGARGPSWPPSPRVP 
MEPPNLYPVKLYVYDLSKGLARRLSPIMLGKQLEGIWHTSXVVH 
KDEFFFGSGGISSCPPGGTLLGPPDSWDVGSTEVTEEIFLEYL 
SS LGESLFRGE AYNLFBHNCNTFS NE VAQFLTGRKI PS Y I TDLP 
SEVLSTPFGQALRPLLDSIQIQPPGGSSVGRPNGQS


6489 


1457 


375 


KVAKMATALS E EE LDNED Y YS LLNVRREASS E EL KAAYRRLGML 
YHPDKHRDPELKSQAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQS I EAPL 
TATDTAILSGSLSTQNGNGGGSINFALRRVTSAKGWGELEFGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRGIRPGLTTVLAR 
NLDKNTVGYLQWHCS SPLLQVQRPHRNTRACAPE PS FRP FLHVP 

TWDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS 


6492 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTEHGTPKPFRK 
FDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCEI 
KRQEVINELFYTERAHWTLKVLDQVFYQRVSREGILSPSELRK 
I FSNLEDILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDIIPTQMQRLTKYPLLLDNIATYTEWPTEREKVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTS S LKLSEYPN 
VEELRNI^LTKRKMIHEGPLWKVWRDKTIDIiYTLLLEDILVLL 
Q KQDDRLVLRCHS KI LAS TADS KHTFS P VI KLS TVLVRQVATDN 
KALFVI SMSDNGAQIYELVAQTVSEKTVWQDLI CRMAASVKEQS 
TKP I PLPQSTPGEGDNDEEDPSKLKEEQHGISVTGIiQSPDRDLG 
LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK 
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT 
EKS VQEDWQHFPRYRTAS QGPQTDSVIQNSENI KAYHSGEGHMP 
FRTGTGDIATCYS PRTSTES FAPRDS VGLAPQDSQASNILVMDH 
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 
EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP 
AVESTHQQQHS PQNTHSDGAI SPFTPEFLVQQRWGAME YSCFE I 
QSPSSCADSQSQIMEYIHKIEADLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA 
KEQNTGHNNINGWQPSGTSKTLYSTNMALSSS PG IS AVQLVRT 
VGHTTTNHLIPALCTSSPQTLPMNNSCLTNAVHLNNVSWSPVN 
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL
PWVRYITQNGDYQLRTQ


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 
PWVRYITQNGDYQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPBYSIHSLFCIMFLCAQEWLTLGLNVPLLFY 
HFWRYFHCPADSSELAYDPPVVMNADTLSYCQKEAWCKLAFYLL 
SFFYYLYCMIYTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGESGV 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKELY 
DHAEATIWMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALDSTNVELAFETVLKEIFAKVSKQRQNSIRTNAITLGSAQAGQ
BPGPGEKRACCISL 


6498 


2636 


272 


S LRLCPWGTHLiAGPTTMRLS S LLALLRPALP L I LGLS LGCSLS L 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFBCPRI 
VP YYRDPNKPYKKVLRTRYI QTELGS RERLLVAVLTS RATLSTL 
AVAVNRTVAHHFPRLLYFTGQRGARAPAGMQWSHGDERPAWLM 
SETIJlHLHTHFGADYDWFFIMQDDTYVQAPRLAAIiAGHLiSINQD 
LYLGRAEEFIGAGEQARYCHGGFGYLLSRSLLLRLRPHLDGCRG 
DILSARPDEWLGRCLIDSLGVGCVSQHQGQQYRSFEIiAKNRDPB 
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALEIiERAYSEIEQL 
QAQIRNLTVLTPEGEAGLSWPVGLPAPFTPHSRFEVLGWDYFTE 
QHTFS CADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLSRV^ILPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAWLAVRAEAPSQVRIiMDWS KKHPVDTLFFLTTVWTRPG 
PE VLNRCRMNAI SGWQAFFP VHFQEFNPALS PQRS PPGPPGAGP 
DP PS P PGAD PS RGAP IGGRFDRQAS AEGCFYNAD YLAARARLAG 
ELAGQ2EEEALEGLEVMDVFIiRFSGLHLFRAVEPGLVQKFSLRD 
CS PRIiS EEL YHRCRLSNLEGLGGRAQLAMALFEQEQANS T 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGGYVIiSLVHDVRFHHFPIERQLNGTYAIAG 
GKAHCGPAELCEFYSRDPDGIjPCNLRKPCNRPSGLEPQPGVFDC 
LRDAMVRDYVRQTWKLEGEALEQAI I S QAPQ VE KI» I ATTAHERM 
PWYHS SLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALS LI YG 
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC 
LKEACPNS S ASNASGAAAPTLPAH P STLTHPQRR I DTLNS DG YT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LIAD I ELGCGNFGS VRQGVYRMRKKQI DVAI KVLKQGTEKADTE 
EMMREAQ IMHQLDNP YXVRLI GVCQAEALMLVMEMAGGGPliHKF 
LVGKREE I PVSNVMLIiHQVSMGMKYLEEKNFVHRDLAARNVLL 
VNRHYAKISDFGLSKALGADDSYYTARSAGKV^LKWYAPECINF 
RKFSSRSDVWSYGVTMWEALSYGQKPYKKMKGPEVMAFIEQGKR 
MECPPECPPELYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA 
SKVEGPPGSTQKAEAACA 


6500 


1773 


726 


TGPTHAS ADAWGLVRS VTE WCANVRGNP CAAALS C PQAVLDAGK 
MLSESSSFLKGVMLGSIFCALITMLGHIRIGHGNRMHHHEHHffi 
QAPNKED I LKI SEDERMELSKS FRVYCI I LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFESIN^TNDMWLMMRKAYKYAFDK 
YRDQ YNWFFLARPTTFAI I ENLKYFLLKKDPSQPFYLGHTI KSG 
DLE YVGM EGGI VLS VESMKRLNS LLN I P E KCPEQGGM I WK I S ED 
KQLAVC L KYAG VFAENAEDAD G KDVFNT KS VGLS I KEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501 


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSLGTAGSAHLI IKDLGE 
XHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTLPKCRDTMRDSLSQVLQRLQAANDSVCRLQQREQERKKI 
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEMEKDLAKFSTF 


6502 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KSSEALE FMKRDLTE FTQ WQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISBLLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ
EQARRDALKQRAEQS I S EEPGWEEEEEELMG I S PI S P KEAKVP V 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP


SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSmGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503


213


1650


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 

VPCT7AT T7TMVT3T^T Tf UfTmnTnUfiTA f »l' T A TVT'A C\ ATIOT 1 ITT ATUPO 
KJboliiyjlirMlU<JJijlrjr iy vVyHjJi AL. J. ' A A i A^MK KK 1 .A 1 h,tt.S 

SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYIX5TKARLYSLQSDPATYCNBPDGPPELFDAWLSQFCLEEK 
KGE I SELLVGSPS I RALYTKMVPAAVSHSEFWHRYFYKVHQLEQ 
EQARRDALKQRAEQS I S E E PG WE E EE EELMG I S P I S P KEAKVP V 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 

T W <3 V D T ,T P 21 nUTHT! PF P P PP&PVFTT ,P FF Z\ P TDT .RVFFT ,TJ<3 n«in 

KSTPSNWGKKGSSTDISEDWEKDFDIiDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAHWVCLSILS P P PAGMKT PNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNI VGCRI SHGWKEGDEP ITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 

REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFHIYVYDLVKKS 


6505 


2131 


1294 


GKVCLVAHWVCLS I LS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCR I SHGWKEGDEP ITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
RE PGG WDGLIGKHVE YTKEDGSKRI GMVIHQVEAKPSVYF I KF 
DDDFHIYVYDLVKKS 


6506


1


1350


EVS P PTS CCLTVAVADPGVS EGFRGFGAGCEMPGRGRCPDCGST 
EIjVBDSnxSQSQLVCSDCGCVvTEGVLTTTro YfaK. 
S1X3ENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAI CTLLYAD 
LDVFSSTYMQIVKLLGU)VPSLCLAELVKTYCSSFKLFQASPSV 
PAKYVEDKEKMLSRTMQLVELA^TWLVTGRHPLPVITAATFLA 
WQSLQPADPXSCSIiARFCKIiANVDLPYPASSRLQELLAVLLRMA 
EQLAWLRVLPJjDKRS VVKHIGDLLQHRQS LVRSAFRDGTAEVET 
REKEPPGWGQGQGEGEVGWNSLGIiPQGKRPASPAIiIiLPPCMLKS 
P K"R T PP VP P VQTVPOf) FM T HSF T FO YT /R T POF VRD FOR AO A AR 

QAATSVPNPP 


6507 


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVT I P I WQNKPHGA 
ARS WRR I GTNLPLKP CARAS FETLPN I S DLCLRDVP P VPTLAD 
I AM I AADE E FTY ARVR S DTR PLRHTWKPS P LI VMORNAS VPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTPD FLS PGSSNVSS PLP CFGSSFHSTTS FVI SD I TEETEVE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRSI^KEEDPAVLISEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHVWDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
I LQLVGDAVHPQFKE I QKLI KEPAPDSGLLGLFQGQNSLLH 


6509 


2 


1053 


FVWNPRGGRKRRRQAAVTQAATRASGTPS PRDGTMTQGKLSVAN 
KAPGTEGQQQVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP 
APTAVPLHP S WAYVDP S S S S S YDNG FPTGDHEL FTTFS WDDQKV 
RRVFVRKVYTILLIQLLVTLAWALFTFCDPVKDYVQANPGWYW 
ASYAVFFATYLTLACCSGPRRHFPWNLILLTVFTLSMAYLTGML
SSYYNTTSVLLCLGITALVCLSVTVFSFQTKFDFTSCQGVLFVL 
LMTLFFSGLILAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
IiLMGNRRHSLSPEEYI FGALNI YLDI IYIFTFFLQLFGTNRE 


6510 


37 


1156 


PCALDGCPQRGAVHPLLSSAMGLIAFLKTQFVLHLI.VGFVFVVS 
GLVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLBWW 
SCTECTLFTDQATVERFGKEHAVIILNHNFEIDFLCGWTMCERF 

GVLGSSKVLAKKELL YVPLIGWTWYFLE IVFCKRKWEEDRDTW
EGLRRLSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL
KYHLLPRTKGFTTAVKCIIRGTVAAVYDVTLNFRGNKNPS LLGI L

YGKKYEADMCVRRFPLEDIPLDEKEAAQWLHKLYQEKDALiQEIY 
NQKGMFPGEQ FKPARRPWTLLNFLS WAT ILLS PLFS FVLG VFAS 
GS PLLI LTFLGFVGAGNGHCR 


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWAS FPS PLPG PAPLKGGK
TMATNFSD I VKQGYVKMKSRKLGI YRRCWLVFRKSSSKGPQRLB 
KYPDEKSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAI IFTDD 
SARTFTCDSELEAEEWYXTLSVECLGSRLNDISLGEPDIiUVPGV 
QCEQTDRFNVFLLPCPNLDVYGECXLQITHENIYLWDIHNPRVK 
LVS WPLCSLRR YGRDATRFTFEAGRMCDAGEGL YTFQTQEGEQ I 
YQRVHSATIiAIAEQHKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHH1TGSQNIAEASSYAGEGYGAAQASSETDLLNRFI 
LLKPKPSQGDSSEAKTPSQ 


6512


159


807


FGKKSTWFPLSRSLRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSS IWRKEPRMVCTRKTK 
TLVSTCVILSGMTNIICLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKI IERLDHLENVIKQHIQEAPAKPEEAEAEPFTDSS
LFAHWGQELSPEGRRYALKQFQYYGYNAYLSDRLPLDRP 


6513


2


756


FVS PE PGF S LAQLNL I WQ LTDTKQLVHS FAEGQDQGSA YANR TA 
LFPDLLAQGNASLRLORVRVADEGSFTCFVS IRDFGSAAVSIiOV 
AAP YSKPSMTLEPNKDLRPGDTVTI TCSS YQG YPEAEVFWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHSILRWLGANGTYSCLVRNPV
LQQDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSF
SPE PGFSLAQLNLI WQLTDTKQLVHS FAEGQDQGS AYANRTALF
PDLLAQGJJAS LRLQRVRVADEGS FTCFVS IRDFGS AAVSLQVAA
PYS KPSMTLEPNKDLRPGDTVT ITCSS YQGYPEAEVFWQDGQGV 
PLTGNVTTSQMANEQGLFDVHSII^WLGANGTYSCLVRNPVLQ
QDAHSS VTITPQRS PTGAVEVQVPEDPWALVGTDATLRCS FSP 
EPGFSLAQIiNLl WQLTDTRQLVHS FTEGR 


6514


985


302


VGI PGPTI SSAAEMEDLLDLDEELR YSLATSRAKMGRRAQQESA 
QAENHLNGKWSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR 
KASEEIEDFRLRPQSLNGSDYGGDIPI I PDLEEVQEEDFVLQVA 
APPSIQI KRVMT YRDLBNDLMKYSAI QTLDGE I DLKLLTKVLAP 
EHEVRERNPSWQDDVGWDWDHLFTBVSSEVLTBWDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCXSAGSTQLEVSASASCGALGSADMNP I W 
VHGGGAGP I SKDRKERVHQGMVRAATVG YGILREGGS AVDAVEG 
AWALEDDPEFNAGCGSVLNTOGEVEMDASIMDGKDLSAGAVSA 
VQCIANP I KLARLVMEKTPHC FLTDQGAAQFAAAMGVPE I PGE K 
LVTERNKKRLEKEKHEKGAQKTDCQKNLGTVGAVALDCKGNVAY 
ATS TGGI VNKMVGRVGDS PCLGAGGYADNDIGAVSTTGHGES I L 
KVNLARLTIiFHIEQGKTVEEAADLSLGYMKSRVKGLGGLIVVSK 
TGD WVAKWTS TSM P WAAAKDGKLHFG I D P DDTT I TDL P 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMP YSTNKELI LGIMVGTAGISLLLLWYHKVR 
KPGIAMKLPEFLSLGNTFNSITIiQDEIHDDQGTTVIFQERQLQI 
LEKLNELLTNMEELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS
FPVPKAFNTRVEELNLDVLLQKVDHLRMS ESGKS ES FELLRDHK 
EKFRDEIEFMWRFARAYGDMYELSTNTQEKKHYANIGKTLSERA 
INRAPMNGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
IKLLPEEPFLYYLKGRYCYWSKLSWIEKKMAATLFGKIPSSTV 
QEALHNFLKAEELCPGYSNPNYMYLAKCYTDLEBNQNALKFOTL 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6517 


3 


1414 


GRVWGGS S S LNAM VYVRGHAED YERW QRQGARG WDYAHCL P Y FR 
KAQGHELGASRYRGADGPLRVSRGKTNHPIiHCAFLEATQQAGYP 
LTEDMNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA 
EAETLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN 
S PQLLMLSGIGNADDLKKLG IPWCHLPGTVGQNLQDHLE IYI QQ 
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFIR 
SQPGVPHPDIQFHFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV 
GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKLTREIFAQE 
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAVVDPQTRVLGVENLRVVDASIMPSMVSGNLNAPTIMIA 
EKAADI I KGQPALWDKDVPVYKPRTLATQR 


6518 


242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPPPPSTMGDAGSERSKAPSLPPRCPCGFWGSSKTMN 
LCSKCFADFQKKQPDDDSAPSTSNSQSDLFSEETTSDNNNTSIT 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSLITPTKRSC 
GTDSQSENEASPVKRPRLLENTERSEETSRSKQKSRRRCFQCQT 
KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KMVKLDRKVGRSCQRIGEGCS 


6519


3


1113


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AXKVRTEEKXAPRRVNGEGGSGGNSRQLQPPAAPS PQSYGS PAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKI KDKI KERDKEKEREKKKHK 
VMNE I KKENGEVKI LLKSGKEKPKTN IEDLQI KKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKI PENSEFPFVSLKEPRVQNNLKRLDTLBFKQLI 
HIEHQPNGGASVIHCLQ 


6520


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLS AAPSPS SSRSSFS FSAGTAVPSSASASLS QPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKS RRPKEKRE KERRRHGL 
GGAREAGGASREENGEVKPLPRDKI KDKI KERDKEKEREKKKHK 
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQI^KKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKI PENS EFP FVS LKEPRVQNNLKRLDTLEFKQL I 
H IEHQPNGGASVIHCLQ 


6521


184


1798


KLFKMATDTSQGELVHPKALPLIVGAQLIHADKLGEKVEDSTMP 
IRRTVNS TRETP P KSKLAEGEEEKP EPD I SSEES VSTVEEQENE 
TPPATSSEAEQPKGEPENEEKEENKSSEETKKDEKDQSKEKEKK 
VKKTIPSWATLSASQLARAQKQTP^5ASSPRPKMDAILTEAIKAC 
FQKSGASWAIRKYIIHKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKNA 
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS 
AIAAMNEPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCE 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA 
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 


6522


1042


391


NKWLRPSPRSHRTPESGRVLSLFRIjPPPGMAIjSGSTPAPCWEED
ECLDYYGMLSLHRMFEWGGQLTECELEIiLAFLLDEAPGAAGGIj 
SRARSGLKLLLELEIUlGQCDESNLRLLGQLLR\n^ARHDLLPHLA 
RKRRRPVS PERYS YGTSSSS KRTEGSCRRRRQSS SSANSQQGS P 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 




XU27 / 


A2» CV/i KKK i AAJjU b (jit K± ACtKRS P I AbAMASNFND I VKQG YVKI 
RSRKLGI FRRCWLVFKKASS KG PRRLEKPPDEKAAYPRNFHKVT 
EliHNIKNITRLPRETKKHAVAI IFHDETSKTFACESELEAEEWC 
KHLCMECLGTRIiNBISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQITHENI YLWD IHNAKVKIiVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524


2


7 0Q7 
J. yj j / 


>wv-UltlKKiA/iJ^ooriKJLAVjKKoJr lAJjAMAoNfc NDI VKQGYVKI 
RSRKXjGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI IFHDETSKTFACESELEAEEWC 
KHLCMECLGTRLOTISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTS LTE PMTLS KS I SLPRSAYWHHI TRQNS VGE 
I YSLQGNHENRHSDLTGKSCKTS ENRFLEENAPLVMYG I THHLF 
MDTSTCKWHDLE 


6525


1


1859


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 
PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 
KES KSGLVKPGSEADFS S SSSTGS I SAPEVHMSTAGS KRS S SSR 
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQYLTPLQQKEVTVRHLKTKLKESERRLHERESEIVELKSQL^ 
MRfcjDWIEEECIIRVEAQLA^ 

KGIQKYFV15INIQNKKLESLIiQSMEMAHSGSIjRDELCLDFPCDS
PE KS LTLNP PLDTMADGLS LEEQVTGEGADRELLVGDS I ANSTD 
LFDEIVTATTTESGDLELVHSTPGANVT.ETiT.PIVMGQEEGSVVV 
ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 
MES FPESLSALWDLTPRNPNSAI LLS PVETPYANVDAEVHANR 
LMRELDFAACVEERLDGVI PLARGGWRQYWSSS FLVDLLAVAA 
PWPTVLWAFSTQRGGTDPVYNIGALLRGCCVVALHSLRRTAFR 
IKT 


6526


2


2034


SGRAGEPEEWRGRQI IDS KETWI PFNSEDSQQLEEAYSSGKGCN 
GRWPTDGGRYDraLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVPYSESFSQVXjEETYMIAOTLDEWKKKLESPNREIIILHNP 
KIjMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVEN I S VDI HCGE P 
I^IDHLVTVVHGIGPACDLRFRSIVQC^^FRSVSLNIjLQTHFK
KAQENQQIGRVEFLPVNWHSPLHSTGVDVDLQRITLPSINRLRH 

FTNDTILDVFFYNSPTYCQTIVDTVASEMNRIYTLFLQRNPDFK

TPTLEEDLKKLQLSEFFDIFEKEKVDKEAIiALCTDRDtiQEIGIP 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGDYLDVGIGQVSVKYPRLI YKPE 1 FFAFGSP IGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEEVLPINV 
GMLNGGQRIDYVIjQEKPIESFNEYLFALQSHLCYWESEDTvIjLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS 
FIFNGDAAPSESFWLDNEQKVYQRIHHEESEMETEEBVDILMS 
SDIYSATLSTKSISFTRAQTGWLFREDKTERVGNFLADFYLVNG 



WO 01/53312 



PCT/US00/34263



LVLESRKRREHLSEEDILRNKAIMESLSKGGNIMEQNFEPIRRQ 
SLTPPPQNTITWEEYISAENGKAPHLGRELVCKESKKTFKATIA 
MSQEFPLGIELLIiNVLBWAPFKHFNKLREFVQMKLPPGFPVKL 
DIPVFPTITATVTFQEFRYDEFDGSIFTIPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKILFCVLGLYIA 
IPFLIKLCPGIQAKLIFLNFVRVPYPIDLKKPQDQGLNHTCNYY 
LQPEEDVTIGVWHTVPAVWWKNAQGKDQMWYEDALASSHP I ILY 
LHGNAGTRGGDHRVELYKVTiSSLGYHWTFDYRGWGDSVGTPSE 
RGMT YDALHVFDW I KARSGDNP VYI WGHSIX3TGVATNLVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSG I KFANDENVKH IS CPLL ILHAEDDPVVPFQLGRKLYS I AA 
PARS FRDFKVQFVPFHS DLG YRHKYI YKS PELPR I LREFLGKSE 
PEHQH 


6529


363


2215


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 
EGVEDES FLKWFCGNVNEQNVLSEREIjEAFSI LQKSGKP ILEGA 
ALDEALKTCKTSDLKTPRLDDKELEKIiEDEVQTLLKLKNLKIQR 
RNKCQLMAS VTSHKSLRLNAKEEEATKKLKQSQG 1 LNAM I TKIS 
NELQALTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPSICD 
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESLHSLTSKAVDKENLDAKISSLTSEIMKLEKBVTQIKDRSLPA 
VVRENAQLLNMPVVKGDFDLQIAKQDYYTARQELVLNQL I KQKA 
SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDPSVSQQINPRNTIDTKDYSTHRLYQVLEGENKKKELFLTHGN 
LEEVAEKLKQNISLVQDQLAVSAQEHSFFIiSKRNKDVDMLCDTL 
YQGGNQLLLSDQELTEQFHKV^SQLNKLNHLLTDILADVKTKRK 
TLANNKLHQMEREFYVYFLKDEDYLKDIVENLBTQSKIKAVSLE 
D


6530


128


2986


GAAHHGAI VQ VHP LLPGS S T I M I HDLCL VF PAPAKAWYVS D I Q 
ELYIRVVDKVBIGKTVKAYWVLDLHKKPFLAKYFPFMDLKLRA 
ASPI ITLVALDEALDNYTITFLIRGVAIGQTSLTASVTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FSISNESVALVSAAGLiVQGLAIGNGTVSGliVQAVDAETGKWI I 
SQDLVQVEVLLLRAVRIRAPIMRMRTGTQMPIYVTGITNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDIiRGRHHE AS IRLP S Q YNFAMNV 
LGRVKGRTGLRAVVKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNP E I EAEQ ILMS PNS Y I KLQTNRDGAAS LS YRVLDGPE KVP V 
VHVDEKGFLASGSMIGTSTIEVIAQEPFGANQTI I VAVKVS PVS 
YLRVSMS P VLHTQNKEALVAVPIiGMTVTF'TVHFHDNSGDVFHAH 
S SVLNFATNRDDFVQ IGKGPTNNTCVVRTVSVGLTLLRVWDAKH 
PGLSDFMPLPVLQAI S PELSGAMWGDVLCLATVLTSLEGLSGT 
WSSSANS ILHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW 
S VPQR I MARHLHP IQTSFQEATAS KV I VAVGDRS SNLRGECTPT 
QREVIQALHPETLISCQSQFKPAVFDFPSQDVFTVEPQFDTALG 
QYFCS I TMHRLTDKQRKHLSMKKTALWS ASLSS S HFSTEQVGA 
EVPFS PGLFADQAE I LLSNHYTS SEIRVFGAPEVLENLEVKSGS 
PAVLAFAKEKSFGWPSFITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQAIAIPVTVAFVVDRRGPGPYGASLFQHFLDSYQVMFFTLF 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
S PTSPNALP PARKAS P PSGLWS PAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCT1SSS 
S LCMVI TI YYDVKVRFJVRGCGQ Y I S YRCQEKRNT YFAE YWYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPIiAIiPLSDSQIQWFYQAL 
NLS IjPL PNFHAGTE pdgldp mvtls LNLGLS FAELRRM yl FLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL
ITDSTGTHLVLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA

QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 

QMDDIiFDILIQSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 

PS P S AELPQAAP P P PG S P S L PGRLED FLES STGLPLLTS GHDGP 

EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 

DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLFSTDFLDGHD 

IiQLHWDSCL 


6533


1798


373 


STISWIiARVEPPRRSSGVGAARLRFPGGSRPLRARACVIJUiAVL 
ALLERNNADSMS AHSMLCER I AIAKEL rKRAES LS RS RKGG I EG 

AENLEEWSVLHVFGYTDTLGEKQTLWDWANGGHTWVKAIGR 
KAEALHNI WLGRGQYGDKS 1 1 EQAEDFLQASHQQPVQ YSNPHI I 
FAFYNS VSS PMAE KLKEMG I SVRGD I VAVNALLDHPE ELQPSES 
ESDDEGPELLQVTRVDRENILiASVAFPTEIKVDVCKRVNLDITT
LITYVSALSYGGCHFIFKEKVIjTEOAEOERKEOVIjPOLEAFMTCD 

KELFACESAVKDFQSILDTLGGPGERERATVLIKRII^WPDQPS 
ERALRLVASSKINSRSLTIFGTGDTLKAITMTANSGFVRAANNQ 
GVKFSVFIHQPRALTESKEALATPLPKDYTTDSEH 


6534 


47 


596 


KATRF I SAAFWLNKQGVS PAKLPHTS WS WSLQTLS FLFSGDLA 

EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYS YGATPYIiFWMERTVPlRAPT TiT 1 VrT,K"PMT? vaQ Q 

LEKKEKEDESFQLLLGSRYNVLKAHCLLPLIRWLTSGDSLLSAQ 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTLAFLEIDKAFSSHARLS 
ADATLLTSGTTATVALLPJ)GI EL WASVGDSRAILCRKGKPMKL 
TIDOTPERKDEKERIKKCGGFVA^SLGQPHVNGRLAMTRSIGD 
LDLKTSGVIAEPETKRIKLHHADDSFLVLTTDG INFMVNSQEI W 
DFVNQCHDPNEAAHAVTEQAI Q YGTEDNSTAVWPFGAWGKYKN 
SEINFSFSRSFASSGRWA 


6536


242


1174 


SliVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENSETWTGS 
LDDLVKVWKWRDERLDLQWSLEGHQLGVVSVDISHTLPIAASSS 
LDAHIRLWDLENGKQI KS I DAGPVDAWTLAFS PDSQYLATGTHV 
GKVNIFGVESGKKEYSLDTRGKFILSIAYSPDGKYLASGAIDGI 
INI FD IATGKLLHTLEGHAMP IRS LTFS PDSQLLVTASDDGYIK 
IYDVQHANLAGTLSGHASWVLNVAFCPDDTHFVSSSSDKSVKW 
DVGTRTCVHTFFDHQDQVWGVKYNGNGS KI VS VGDDQE IHI YDC 
PI 


6537 


1S38 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDRV 
FTTYKLMHTHQTVDFVRSKKAQFGGFSYKKMTVMEAVDLLDGLV 
DESDPDVDFPNSFHAFQTAEGIRKAHPDKDWFHLVGLLHDLGKV 
LALFGEPQWAWGDTFPVGCRPQASWFCDSTFQDNPDLQDPRY 
STEIX3MYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLSPQSTCTR 


6538


3345


2412


P YLYDFLDAL I TCQTAPEEAF I KLDGLAGMLTEQLRRLTKOVGE 
ARHNRDDEAI KKAVNE YDETMEKYI PVLMAQAKI YWNLENYPMV 
EKI FRKSVEFCNDOTVWKLNVAHVLFMQI2NKYKEAIGFYEP ITV^ 
KHYDNIlxWSAIVI*ANLC7SYIMTSQNEKAEELMRKIEKEEEQI» 
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYEFGISRVIKSLEPYN 
KKLGTDTWYYAKRCFLSIiLZ5NMS KHMIVIHDS VIQECVQFLGHC 
ELYGTNIPAVIEQPLEEERMHVGKNTVTDESRQLKALIYEIIGW 
NK 


6539 


218 


339 


FJ^AASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 


6540 


3 


391 


LERLWLLLLRRPEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
KQHAWLPLTIEIKDRLQLRVLLRREDWLGRPMTPTQIGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMLLELLPDD


6541 


1165 


536 


RTLVQRRILMLLRKPARGRDLRGRGRGTPRGGRKGLLPTPDEPP 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542


3


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHALYLAFLARKEGT 
KRGFLSKKTAEASRWHEKWFALYQNVLFYFEGEQSCRPAGMYLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREVLMQKYIHLVQIVET 
EKI AANQLRHQLEDQDTEIERLKSE I IALNKTKERMRP YQSNQE 
DEDPDI KKI KKVQS FMRGWLCRRKWKTI VQDYI CS PHAESMRKR 
NQIVPTMVEAESEYVHQLYILWGFLRPLRMAASSKKPPISHDD 
VSSIFLNSETIMFLHEIFHQGLKARIANWPTLILADLFDILLPM 
LNT YQEFVRNHQYS LQVXiANCKQNRDFDKLLKQ YEANPACEGRM 
LETFLTYPMFQIPRYI ITLHELLAHTPHEHVERKSLEFAKSKLE 
ELSRVMHDEVSDTENIRKNLAIERMIVEGCDILLDTSQTFIRQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTGGVLS L IDCTL I EE PDASDDDS KGSGQVFGHLDF 
KIVVEPPDRAAFTVVLLAPSRQEKAAWMSDISQCVDNIRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YASVERLLERLTDLRFLS IDFLNTFLHTYRI FTTAAWLGKLSD 
I YKRPFTS I PVRSLELFFATSQNNRGEHLVDGKS PRLCRKFSSP 
PPLAVSRTSSPVRARKLSLTSPLNSKIGALDLTTSSSPTTTTQS 
PAASPPPHTGQIPLDLSRGIjSSPEQSPGTVEENVDNPRVDLCNK 
LKRS IQKAVLES APADRAGVESS PAADTTELS PCRS PSTPRHLR 
x KU f WiU i AUNAHCS VS> PASAFAI ATAAAGHG S P PGFNNTERTC 
DKEFI IRRTATNRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKLEDI IQMTDC 
MKAECFESLSAMELAEQITLLDHVIFRSIPYEEFLGQGWMKLDK 
NERTPYIMKTSQHFNDMSNLVASQIMNYADVSSRANAIEKWVAV 
ADI CRCLHNYNGVLE I TS ALNRSA1 YRLKKTWAKVS KQTKALMD 
KLQ KTVS S iEGRFKNLRETLKNCNP PAVP YLGMYLTDLAFI EEGT 

LDKDLI IDEDTLYELSLKIEPRLPA 


6543


1857


950


FVSGCGRAGIGLSWAMAAEARVSRWYFGGIASCGAACCTHPLDL 
LKVHLQTQQEVKLRMTGMALRVVRTDG I LALYSGLS ASLCRQMT 
YSLTRFAIYETVRDRVAKGSQGPLPFHEKYIiLGSVSGLAGGFVG 

SGATMASSRGALVTVGQLS CYDQAKQLVLSTGYLSDNI FTHFVA 
S FIAGGCATFLCQPLDVLKTRLMNS KGE YQGVFHCAVETAKLGP 
IiAFYKGLVPAGI RLIPHTVLTFVFLEOLRKMFGI KVPS 


6544 


630 


79 


PSPCFIRSRIJX3QPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWIjQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR
GPQESPQKMSEEVRAEPQEEEEEECEGKEEKEEGEMAPLPEAHLG
EGKQKECP 


6545 


176 


560 


P PHSHAALLPAAMTPLLTL I LWLMGL P LAQALD CHVCAYNGDN 
CFNPMRCPAMVAYCMTTRTYYTPrRMKVSKSCVPRCFETVYlX5Y 
SKHASTTSCCQYDLCNGTGIiATPATLAIiAPILIiATLWGLL 


6546 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS 
SPGVLKVLAQLGLGFSCANKAEME LVQHI GI PASKI I CANPCKQ 
IAQIKYAAKHGIQLLS FDNEMELAKWKSHPSAKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHHVEWGVSFHIGSGCPD 
PQAYAQS I ADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVR F 
EE I AS VINS ALDLYFPEGCGVD I FAELGRY YVTSAFTVAVS 1 1 A 
KKEVLLDQPGREEENGSTSKTIVYHLDEGVYGIFNSVLFDNICP
TPII^KKPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDW 
LVFDNMGAYTVGMGS PFWGTQACH ITYAMSRVAWEALRRQLMAA 
EQEDD VEGVCKPLS CGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPIiIiLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAVWESLQQEARQAPRPNNLHTLCGA 
P VHVRERGTGSETNQETLRATAPALPMAPAP PLLAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG 
I KGCGITFTLGKGTEVGELKI LSRFQNA 


6549 


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQEEIIQARKHKLIKMC 
SSVAAKLWFLTDRRIREDYPQKEILRALKAKCCEEELDFRAVVM 
DEWLTIEQGNLGLRINGELITAYPQWWRVPTPWVQSbSDIT 
VLRHLEKMGOUiMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS 
YGGHENFAKMIDEAEVLEFPIWVKNTRGHRGKAVFLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMMCSLSEQGKQIiAIQVSNILGMDVCGIDLL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PESTERELLTKLPGGLFNMNQLLANEIKLLVD 


6550


2293


922


FRVSRDGAPDCGIEQMGLAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLIQFLI ILGLVLFMVYGNVHVSTESNLQATERRAEGLYSQ 
LLGLTASQSNLTKEIiNFTTRAKDAIMQMWLNARRDLDRINASFR 
QCQGDRVI YTNNQRYMAAI I LSEKQCRDQFKDMNKSCDALL FML 
NQKVKTIaEVE I AKEKT I CTKDKES VLLNKRVAEEQLVECVKTRE 
LQHQERQIAKEQLQKVQALCLPLDKDKFEMDLRNLWRDSIIPRS 
LDNLG YNLYHPLGSELAS IRRACDHMPS LMS S KVEELARSLRAD 
IERVARENSDLQRQKLEAQQGLRASQEAKQKVEKEAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 
RNSAIiDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI
LESQRPPAGI PVAPSSG 


6551


157


748


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFIiADPLNKSSYKYE
ADTVDLNWCVI S DMEVI ELNKCTSGQS FEVI LKPPS FDGVPEFN
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMIiERLQ
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE

ADTVDLNWCVISDMEVIBLNKCTSGQSFEVILKPPSFDGVPEFN 

ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHliAEKREHER 

EVIQKAIEENNNFIKMAKEKIAQKMESNKENREAHLAAMLERLQ 

EKDKHAEEVRKNKELKEEASR 


6553


2


1807 


FVWS KMAAHLS YGRVNLNVLREAVRREIiREFLDKCAGSKAI VWD 
EYLTGPFGLIAQYSUiKEHEVEKMFTLKGNRLPAADVKNII FFV 
RPRLELMDI IAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
LGVLGSFIHREBYSLDLIPFDGDLLSMESEGAFKECYLEGDQTS 
L YHAAKGLMTLQ AL YGTI PQ I FG KGE CARQ VANMM IRMKRE FTG 
o\Jvio if rv cUJNijJjJbljDRNVDLLTPIjATQLTYEGLI I YG IQNS 
YVKIiPPEKFAPKKQGDGGKDLPTEAKKLQIiNSAEELYAEIRDKN 
FNAVGSVLSKKAKI ISAAFEERHNAKTVGE I KQFVSQLPHMQAA 
RGSLANHTSIAELIKDVTTSEDFFDKLTVEQEFMSGIDTDKVNN 
YIEDCIAQKHSLIKVLRLVCLQSVCNSGLKQKVLDYYKREIIiQT 
YGYEHILTLHNLEKAGLLKPQTGGRNNYPTIRKTLRLWMDDVNE 
QNPTD I S YVYSGYAPLS VRLAQLLSRPGWRS I EEVLR I L PGPHF 
EERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAALRFLSQL 
EDGGTE YVIATTKLMNGTSW I EALMEKPF 



6554 


119 


1244 


PPMnQnV9\7T?Qfi2iT,Tr\n7T\7f2rtniirr!<^iTazi2icoT nm mmcMT \rnM 
r v v *3 v aounJJllV v x vVsVSwCVsvjX/innoUuwAJjM V ir r rlJ-ivJJiYI 

KDS FHHNVAALRAS VETGFAKKT F I S YS VT FKDNFRQGL WGID 

LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 

AYEDMVRQVQRSRFIVWGGGSAGVEMAAEI KTEYPEKEVTLIH 

SQVALADKELLPSVRQEVKEILLRKGVQLLLSERVSNLEELPLN 

E YREYIKVQTDKGTE VATNLVILCTG I KINS SAYRKAFESRLAS 

SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 

ANIVNSVKQRPLQAYKPGALTFLLSMGRNIX3VGQISGFYVGRLM 


6555


1552


498


IHMALIjRKINQVLLFLLIVTLCVILYKKVHKGTVPKNDADDESE 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQPLNFVRFYLPLLIHQHEFCVIYIoDDDVIVQGDIQELYDTTLA 
LGHAAAFSDDCDLPSAQD INRLVGLQNTYMG YLD YRKKAI KDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SLGGGVATS PML I VFHGKYSTINPLWH I RHLGWN PDAR YS EHFL 
QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIILPSLQSPPTFGFLLDIDGVLVRGHRVI 

CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGLGFR 
NVVTVDELRMAFPLI^MVDLEFJ^ 

EPVRWETSLQLIMDVLLSNGS PGAGLAT PPYPHLPVLASNMDLL 
WMAEAKMPRFGHGTFLLCLET I YQKVTGKELRYEGIjMGKPS I LT 
YQYAEDLIRRQAERRGWAAP I RKLYAVGDNPMSDVYGANLFHQ Y 
LQKATHDGAPELGAGGTRQQQPSASQS C I SI LVCTGVYNPRNPQ
STEPVLGGGEPPFHGHRDLCFSPGLMEASHWNDVNEAVQLVFR 
KEGWALE 


6557


2598


•i J jt; 



KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
SKLQFOTTNCRSDTVMEKRSFKIHPL^ 

QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLIiT 
MAGIFDCWEPPEGGDVLYSYTIITVDSCKGLSDIHHRMPAILDG 
EEAVSKWLDFGEVSTQEALKLIHPTENITFHAVSSWNNSRNNT 
PECLAPVDLWKKELRASGSSQRMLQWLATKSPKKEDSKTPQKE 
ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ


6558 


21 


1138


VASCDAAVAQCFLAENDWEMERALNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLSERARGVCS YLALYS PDV1 FLOEVT PP YYQ YT ,K KP «? q 

NYEIITGHEEGYFTAIMLKKSRVKLKSQEI I PFPSTKMMRNLLC 
VTIVNVSGNELCLMTSHLESTRGHAAERMNQLKMVLKKMQEAPES 
ATVIFAGDTNLRDREVTRCGGLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLGITAACKLRFDRIFFRAAAEEGHI I PRSLDLLGLEKL 
DCGRFPSDHWGLLCNLDIIL 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435 


TATSGG I WLRRKWRCHWPRPLPQS CVGTEGGLQVRDTSSRI AKG 
GVDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTS PTRQHHVEREKDHSSSRPSS PRPQKAS PNGS I S SAGNS 
SRNSSQSSSDGSCKTAGEMVFVYENAKEGARNIRTSERVTLIVD 
NTRF WDPS I FTAQPNTMLGRMFGSGPJEHNFTRPNEKGEYEVAE 
GIGSTVFRAILDYYKTGIIRCPDGISIPELREACDYLCISFEYS 
TIKCRDLSALMHELSNDGARRQFEFYtiEEMILPLMVASAQSGER 
ECHIWLTDDDWDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE
SEQ ID NO:


Predicted beginning nucleotide location corresponding
to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding
to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
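The one-letter legend in the column header can be applied mechanically when reading the amino acid segments in this table. The following is only a minimal sketch: the names `LEGEND` and `decode_segment` are hypothetical and not part of the specification, and the choice to skip whitespace and to report unrecognized characters separately is an assumption made for illustration.

```python
# One-letter codes exactly as listed in the column legend of this table.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Expand a one-letter segment into legend names.

    Returns a pair (names, unrecognized). Whitespace is skipped
    (an assumption); any character outside the legend is collected
    in `unrecognized` rather than guessed at.
    """
    names = []
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in LEGEND:
            names.append(LEGEND[ch])
        else:
            unrecognized.append(ch)
    return names, unrecognized


if __name__ == "__main__":
    names, unknown = decode_segment("MSTLYD")
    print(names)
    print(unknown)
```

Collecting characters that fall outside the legend, instead of silently dropping them, makes it easy to flag segment lines that contain symbols the legend does not define.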








NRDVAKSVLKERGLKKIRLG IEGYPTYKEKVKKRPGGRPE VI YN 
YVQRP FI RMS WEKEEGKSRHVD FQCVKS KS ITNLAAAAAD I PQD 
QLWMHPTPQVDELDI LP IHPP SGNSDLDPDAQNPML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLS&EPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRLQTTLIiDVTKSES 
I KAAAQWVRDKVGEQGLWALVNNAGVGLPSGPNBWLTKDDFVKV 
INVNLVGLIEVTLHMLPMVKRARGRWNMSSSGGRVAVIGGGYC 
VSKFGVEAFSDS I RRELYYFGVKVCIIEPGNYRTAILGKENLES 
RMRKLWERLPQETRDS YGEDYFR I YTDKLKNIMQ VAE PRVRDVI 
NSMEHAI VS RS PR I R YNPGLDAKLLY I PLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTEPCVFSGGWL 
WRFKARHG I KKLDASSEKQSADHQAAEQFCAFFRSIiAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLPVAYKAQGNAWVDKEIFS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTI FLPAS VAS LVQPMEQG I RRDFMRNF INPPVPLQGPHAR 
YNMNDAIFSVACAWNAVPSHVFRRAWRKLWPSVAFAEGSSSEEE 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWOVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
. QAAVAFDAVLRFAERQ PC FS AQEVGQLRALRAVFRSQQQVRRRR ' 
GALGAWKVEALQEGPGGCGATAQS PLPCSSTAGDN 


6563 


1319 


2694 


LARPAQP VLLRE PEGAGP P VPAGHLVHHIiQGGHLRERAHPDLEA 
HEHPLP CDQMFWRQMGGHLRMVEANS RG WWGIGYDHTAWVYTG 
GYGGGCFQGLASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 
LPTDRYMWS DASGLQECTKAGTKPPS LQWAWVSDWFVDFS VPGG 
TDQEGWQYASDFPASYHGSKltoKDFVRRRCWARKCKLVTSGPWL 
EVPPIALRDVS 1 1 PESPGAEGSGHS IALWAVSDKGDVLCRLGVS 
ELNPAGSSWLHVGTDQPFAS IS IGACYQVWAVARDGSAFYRGS V 
YPSQPAGDCWYHIPSPPRQRLKQVSAGQTSVYALDENGNLWYRQ 
GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVrAJTKVQGSHSLS 
RGTVCHRTGVQ PHE PKGHGWDYG I GGGWDH I S VRANATRAP RS S ■ 
SQEQEPSAPPEAHGPVCC 


6564


1 


975 


APGSCALWSYCGRGWSRAMRGCQLLGLRSSWPGDLLSARliLSQE 
KRAAETHFGFETVSEEEKGGKTOQVFESVAKKYDVMNDMMSLGI 
HRVWKDLLLWKMHPLPGTQLLDVAGGTGDIAFRFLNYVQSQHQR . 
KQKRQLRAQQNLS WEE IAKE YQNEEDSLGGS RVWCD INKEMLK 
VGKQKAIiAQGYRAGLAWVLGDAEELPFDDDKFD I YTIAFG IRNV 
THIDQAIiQEAHRVLKPGGRFLCLEFSQVNNPIiISRLYDLYSFQV 
IPVLGEVIAGDWKSYQYLVESIRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGR YI SLI LAVQIAYLVQAVRAAGKCD 
AVFKGFSDCLLKLGDSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGS 
LLPAFPVLLVSIiS AALATWLS F 


6566 


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
HVPPGCSQGLNPLYYNLCDRSGAWGIVLEAVAGAGIVTTFVLTI 
I LVASLPFVQDTKKRSLLGTQVFFLLGTIiGLFCLVFACVEKPDF 
STCASRRFLFGVLFA I CFS CLAAHVFALNFLARKNHGPRGWVI F 
TVALLLTLVEVI INTEWLI ITLVRGSGEGGPQGNSSAGWAVASP 
CAIAIWDFVMALIYVMLLLLGAFIiGAWPALCGRYKRWRKHGVFV 
LLTTATSVAIWVWIVMYTYGNKQHNSPTWDDPTLAIALAANAW 
AFVLFYVIPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
QSMFVENKAFSMDEPVAAKRPVS PYSGYNGQLLTSVYQPTEMAL
MHKVPSEGAYD 1 1 LPRATANS QVMGSANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKRSNLKAYACSIHHIRTMSYVFVNDSSQTNVPLLQACIDGDFN 
YSKRLLESGFDPNIRDSRGRTGLHLAAARGNVDI CQLLHKFGAD 
LLATDYQGNTALHLCGHVDTIQFLVSNGLKID I CNHQGATPLVL 
AKRRGVNKDVIRLLESLEEQEVKG FNRGTHS KLETMQTAE SESA 
MESHSLLNPNLQQGEGVIjSS FRTTWQEFVEDLGFWRVLLL I FVI 
AIiLSLGIAYYVSGVLPFVENQPELVH 


6568 


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTELFHPTLASISSPM 
LEGAELYFNVDHGYLEGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTDYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LEPLSTFLTYMTCSYMIDNVILLMNGALQKKSVKEILGKCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSENAIiDELN 
IELLRNKLYKS YliEAFYKFCKNHGDVTAEVMCPILEFEADRRAF 
IITLNSFGTELSKEDRETLYPTFGKLYPEGLRLIiAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVF YERE VQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKINSYIPIL 


6569 


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMS WLFPLTKS AS S SAAGS PGGLTSLQQQKQRLI ESLRNSHS S I 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTSPLVNNFTMHSDLGKIIQSLLDEFWKNPPVLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVS S STTSHTTAKPAAP S FGVLSNLPLP I PTVDAS I PTS QNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVLLEQFLTLPQLK 
QIITDKDDLVKSIEEIiARKNLLLEPSIiEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARLKVAAHEAEEESDNIAED 
FLEGKME IDDFLS SFMEKRTI CHCRRAKEEKLQQAI AMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLT FLREG FL YVLLSHWVFVGAPR P PAS D S W KKGii VP SAP 
PASRKMGS KALPAPI PLHPSLQLTNYS FLQAVNT FPATVDHLQG 
LYGLSAVQrMHMNHWTLGYPNVHE ITRST ITEMAAAQGLVDARF 
P FPAL PFTTKLFHPKQGA I AHVLPALHKDRPRFDFANLAVAATQ 
EDPPKMGDLSKLSPGLGSPISGLSKLTPDRKPSRGRLPSKTKKE 
FICKFCGRHFTKSYNLLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHRYIHSKEKPFKCQECGKGFCQSRTLAVHKTLHMQTSSPTAA 
SSAAKCSGETVI CGGT 


6571 


169 


656 


APDMNRKKLQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWIHGL 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
SEEDPDEGI KDLVEI TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPD PKSQME FEAQVFKDPNE FNDM 


6572 


49 


1646 


TPERAQPGALLGAAGCCVCGGRWWPRSHERGYFSSAKMGSKRRN 
LSCSERHQKLVDENYCKKLHVQALKNVNSQIRNQMVQNENDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENS I ELRELEKKLKAAYMNKERAA 
Q IAEKDAI KYEQMKRDAEIAKTMMEEHKR 1 1 KEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
EFANMQQQREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDRIELMNAQKQRMKQLE 
HRRAVEKIiIEERRQQFLADKQRELEEWQLQQRRQGFINAIIEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQSFRAQDGTRTPATDCLMYLQGPRKLMTQGGYDMVQK 
LFLDFFRRRLSQRPTAE E LEQRN I LKPRNEQE EQEEKRE I KRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR
LTAADKVS RQEC WR VfifiRTVC WV Q T £1 P T & V 


6574 


204 


1159 


LE S S VP VS VG VFWACGVS WTGAAGLiQDGALS DTMARNAEKAMTA 
LARFRQAQLEEGKVKERRPFIiASECTELPJCABKWRRQIIGBISK 
KVAQIQNAGLGEFRIRDLNDEINKLLREKGHWEVRIKELGGPDY 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 

KWKAFRRARTiARfiETTPPFPPPPFPTNT YaVTPPP^nPF^Qn'K'Vri 

GDDSQQKFIAiWPVPSQQEIEEAliVRRKKMELLQKYASETLiQAQ 
SEEARRLLGY 


6575 


117 


820 


SPALASQSGGITEEKMLEPQENGVIDLPDYEHVEDETFPPFPPP 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKI^NIPKIiDAQRLI 
SERGLPAIiRHVFDKAKFKGKGHEARDLKMLIRHMEHWAHRLFPK 
LQFEDFIDRVEYLGSKKEVQTOiKRIRLDLPILHEDFVSNNDEV 
AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER 
NKQLALERRQAKLP ! 


6576 


1 


1060 


RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WP VANCGVQE YNSNPKEHMTLRDYITYWKE YI QAGYS S PRGCL . 
YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY 
AGPAGSWSPFHAJJIFRSFSWSVNVCGRKKWLLFPPGQEEAIjRDR ' 
HGNLPYDVTSPALCDTHLHPRNQIiAGPPLEITQEAGEMVFVPSG , 
WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPLSQPELGWNGVAH 
G 


6577


2271


987


SDRMA^DDFDIVIEAMLEAPYKKEEDEQQRKEVKKDYPSNTTSS 
XoiXawN&Jlowoo 1 JLV91S 1 oMKoKIJKl/Kx KKnJNoKoKoJruKQCKriR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKSPVREPVDNLS PE ERDARTVFCMQIiAARIRPRDLEDFFSAV 
GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPIiAIGLTGQRLLG 
VPI IVQASQAEKNFJjAAWAK^QKGNGGPMRLYV 

MIiRGIFEPFGKIDNiyiiMKDSDTGRSKGYGFITFSDSECARRAL 
EQLNGFEIiAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKIiAEGAG IQL PSTAAAAAAAAAAQAAALQLNGAVPIiGA 
LNPAALTALS PALNLASQCLQLSSLFTPQTM 


6578 


377 


1489 


PSflQATMKTRAPT.lfPATTT,TTMZi.T. f TY32iQnPQaT?liPZlTJrtT?1f DITT.T.'Da 

LQI AL WS LYW VTS I SMVFLN KYLLDS PS LRLDTP I FVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFiG 
MITFNNLCLKYVGVAFYNVGRSLTTWNVLI^YLLLKQTTS FYA 
LLTCGI I IGGFWLGVDQEGAEGTLSWLGTVFGVLASLCVSLNAI 
YTTKVLPAVDGSIWPXTFYNNVNACILFLPLLLLLGELQALRDF 
AOLGSAHFWGMMTLGGLFGFAIGYVTGLO I KFTS PLTHNVSGTA 
KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAY'TWVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT
IYMGKDKYENEDLIKHGWPEDIWFHVDKLSSAHVYLRLHKGENI
EDIPKEVLMDCAHLVKANS IQGCKMNNVNWYTPWSNLKKTADM
DVGQIGFHRQKDVKIVTVEKKVNEILNRLEKTKVERFPDLAAEK
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV
ENMSSNQDGNDSDEFM


6580 


62 


1571 


LVALKNWKPKGTNI PAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 
PREALSQLRVLCCEWLRPE IHTKEQILELLVLEQFLT ILPQBLQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KISSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDIISVIIANKPE 
ASLERQCVNLENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI
CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCX3KAFSHSSNLTL 
HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 
KAFSGKGSLIRHYRIHTGEKPYQCNECGKSFSQHAGLSSHQRLH 

Tf2PTf P W CY T?Pf3Tf A "RMW Q QMPUVTTPTR T WTVlTS'Tf P YW nTttPftTfT Pf* 
x 0 Hi i\. ir xxvwivCtwojuvr iNndk}nr in ixxiruxxm vjmN-f x w unnuai\ i r \_ 

SKSNLS KHQRVHTGEGEAP 


6581 


228 


476 


RVFLKDLSSTPMASNNTAS IAQARKLVEQLKMEANIDRI KVS KA 
AADLMAYCEAHAKEDPLLTPVPASENPFREKKFFCAIL 


6582




*7 1 Q 

/ JLo 


v_r 1 I Jxlliv-olr v 0 V r 1 JjorMj VJxKJ\i^lJx^oxjXJiai>JEiulJv v InlJorlHI 

QHPI IFWTLVWYFRI^LPSInxiPGLILTSEHCNEGVQLPLSSLS 
QDSKLVYIQLIiWDNlNLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETIRQSIQHNNVLKPINLLSQQMKPGMKRQRSLYRE 
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERIiQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


RIFSMTSGRIjRWRCTWRPATALWSASLRLGTSSMHPSPRSISLP 
LS MMLS P LPSNTRGLS PTALFRS PDSEHATS C PRLHLWRCRAPL 
RSPSPLGRLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC 
S QAGS GAVQGGNWC I F 


6584 


189 


1750 


PLPMAALGPSSQIWTEYVVRVPKNTTKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG 
I VLKE FRP EDQ P WLIiRVNGKS GRKFKG I KKGG VT ENTS YYIFTQ 
CPIXSAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVIJ^ 
FSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMS 
SDASDASGBEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ES D I D S EASS AFFMAKKKTP P KRERKPSGGS SRGNSR PGT P S AE 
GGSTSSTLRAAASKLEQGKRVSEMPAAKRLRLDTGPQSliSGKST 
PQPPSGKTTPNSGDVQVTEDATORYLTRKPMTTKDLIiKKFQTKK 
: TGLSSEQTVNVIAQILKRLNPERKMINDKMHFSLKE 


6585 


3 


1678 


GP I RNSRIDDFVGGDPRAEAS CS VLHS KPHAMAJDSRDPAS DQMQ 
HWKE QRAAQKAD VLTTG AGN P VG D KLNVI TVG P RG P LL VQ DWF 
TDEMAHFDRER I PER WHAKGAGAFG YFEVTHD ITKYS KAKVFE 
HlbKK JL FxAVKJ?©! VAWctbvjrbAJJ 1 VKUFKijJr AVKj: X icDUNnJJt) 
VGNOTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDFWSLR 
PESLHQVSFLFSDRGI PDGHRHMKGYGSHTFKLVNANGEAVYCK 
FHYKTLK^IKI^SVEDAARIjSQEDPDYGIRDLFNAIATGKYPSW 
TFYI QVMTFNQAETFP FNP FDIiTKVWPHKDYPL I PVGKLVLNRN 
PVNYFAEVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDTHRHR 
t /^PNYT ,K T PVNTP YR AP V AW YPjP DnPMPMnDNOnnZVPMYYPW <3 P 

JJO.tr i\ X XHTXlr VA^V^Jtr ltvU\ VnW X yjC\XAJIrlM^l T jyiJXN^\JVJ.*4irL , » X X riioc 

GAPEQQPSALEHS IQYSGEVRRFNTANDDNVTQ VTUVFYWVLNE 
EQRKRLCENIAGHLKDAQ I F IQKKAVKNFTEVHPD YGSH IQALL 
DKYNAEKPKNAIHTFVQSGSHIiAAREICANL 


6586 


32 


804 


PLPEQPAESTSxWPVSGTPAPNKKRKSSKIjIMELTGGGQESSGL 
NLGKKI S VPRDVMLEELS LLTNRGS KM FKLRQMRVE KF I YENHP 
DVTSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSQAGG 
SGSAGQYGSDQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDGAGGEGKHI TVFKT Y I S PWERAMGVDPQQKMELG I DLLA 
YGAKAELPKYKSFNRTAMPYGGYEKASKRMTFQMPKV 


6587 


75 


1117 


RRVPSLGKMPECWDGEHDIETPYGLIxHWIRGSPKGNRPAILTY 
HDVGLNHKLCFNTFFNFEDMQE ITKHFWCHVDAPGQQVGASQF 
PQGYQFPSMEQIxAAMLPSWQHFGFKYVIGIGVGAGAYVLAKFA 
L I FPDLVEGLVLVNIDPNGKGW I DWAATKLSGLTS TLPDTVLSH 
LFSQEELvl^lNTELVQSx^QQIGiWWQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPViMLVVGDNAPAEDGVVECNSKLDPTTTT 
FLKiMADSGGLPQVTQPGKLTEAFKYFLQGMGYMPSASMTRIiARS 
RTASLTSASSViOGSRPQACTHSESSEGIjGQVKHTMEVS c


6588 


137 


501 


LGLQAQLLELRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQKALS KS KKAQEVEVLLS ENEMLQAKLHSQEEDFRLQNSTLMA 
BFSKLCSQMEQLEQENQQLKEGAAGAGVAQAGP 


6589 


2 


1405 


RPWGSAMATFSRQEFFQQLLQGCLLPTAQQGLDQIWLLLAICLA 
CRLLWRLGLPSYLKHASWAGGFFSLYHFFQLHMWVVLLSLLC 
YLVL FL CRHS S HRGVFLSVTI L I YLLMGEMHMVDTVTWHKMRGA 
QMIVAMKAVSLGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
ISFHSYLQAVQGRPLSCRWLQKVARSLAIiALLCLVLSTCVGPYL 
FPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV 
G FLSEATATLAGAGFTEEKDHLEWDLTVS KPLNVELPRSMVEW 
TSWNLPMSYWLNNWFKNALRLGTFSAVLVTYAASALLHGFSFH 
LAAVLLS LAF I TYVEHVLRKRLAR I LS ACVLS KRCP PDCSHQHR 
LGLGVRALNLLFGALAI FHLAYLGS LFDVDVDDTTEEQGYGMAY 
TVHKWSELS WASHWVTFGCWI FYRLIG 


6590 


2177


656 


VRAYEHVLSLLENVFTPMFCHRDEYFRQLLRGAESPTRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIVVMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPS SERKEKKERI PVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLBSKLTEFHGAFPDAQLPSKRI1GPKNYEFLKSKREE 
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTR ILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVS LI TLLRDAI FCENTE PRSLQDKQKGAKQTFEEM 
MNYI PDLLVKC IGEETKYES I RLLFDGLQQPVLNKQLTYVLLD I 
VI QEL FPE LNKVQKE VTS VTS WM 


6591 


2177 


656 


VRAYEHVLSLLENVFTPM FCHRDE YFRQLLRGAESPTRNS KLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEG I WME DDS PVEAVSTPNTPRNLAAWKI S IP Y 
VDFFEDPSSBRKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQBYLQKLI^HPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTR I LFKNTLEMYTDYYLQCKL 
EQLFQEHRLVS LI TLLRDAI FCENTE PRSLQDKQKGAKQTFEEM 
MNYI PDLLVKC IGEETKYES IRLLFDGLQQPVLNKQLTYVLLD I 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG 
DLPQVERFFKIFPLLGLHEEGLRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVI FADTLTLLFEG IARI VETHQ P I VETYYGPGR 
LYTLI KYLQVECDRQVE KWDKFI KQRDYHQQFRHVQNNLMRNS 
TTEKI EPRELDP I LTEVTLMNARSELYLRFLKKRISSDFEVGDS 
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
TVNKAVALDTYEKGQLTSSMVDDVFYIVKKCIGRALSSSSIDCL 
CAMINLATTELESDFRDVLCNKLRMGFPATTFQDIQRGVTSAVN 
I MHSS LQQGKFDTKGIES TDEAKMS FLVTLNNVEVCSENI STLK 
KTLESDCTKLFSQGIGGEQAQAKFDSCLSDLAAVSNKFRDLLQE 
GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAEFKASLSPVIYDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 
LERVTE I LD YWGPNSGPLTWRLTPAEVRQVLALRI DFRSED I KR 
LRL 


6593 


3 


1837 


EAFSAGS RRRGLALQRGVLGGLGG YCP CCCRRRGRLLVLLLLVR 
RGGEGGGGRGRGDKRRRRQARRQRRRPE PAEARGGKMADVLS VL 
RQYNIQKKEIWKGDEVIFGEFSWPKNVKTNYWWGTGKEGQPR
E YYTLDS I LFLLNNVHLSHPVYVRRAATENI PWRRPDRKDLLG 
YLNGEAS TSAS IDRS APLE IGLQRSTQVKRAADEVLAEAKKPRI 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAKIMAKKRSTI KTDLDDD I TALKQRS FVDAEVD VTRD I VS RE 
RVWRTRTT I LQSTGKNFSKNI FAILQS VKAREEGRAPEQRPAPN 
AAP VDPTLRTKQP I PAAYNRYDQERFKGKEETEGFKI DTMGTYH 
GMTLKSVTEGASARKTQTPAAQPVPRPVSQARPPPNQKKGSRTP 
1 1 1 1 PAATTSLITMLNAKDLLQDLKFVPSDEKKKQGCQRENETL 
I QRRKDQMQPGGTAI S VTVP YRVVDQPLKLMPQDWDRWAVFVQ 
GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELS YH KRHLDRP VFLRVWE TLDR YMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
DILSTIGYDNIIQHLNNGRKNCKEFEDFLKERAAIEERYGKDLL 
NLSRKKPCGQSEINTLKRALEVFKQQVDNVAQCHIQLAQSLREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVSRSANLVNPKQQEKLFVKLATSKTAVEDSDK 
AYMLHIGTLDKVREEWQSEHI KACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
PPAP I MYENFYSSQKNAVPAGKATGPNLARRGPLP IPKSS PDDP 
NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDS DLGE DEG LL S LAG KRKRRGNLP KE S VK I LRDWL YLH 
RYNAYPSEQEKLSLSGQTNLSVLQICNWFINARRRLLPDMLRKD 
GKDPNQFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVLSLS 
VCSMPLHS GQGE KPAAP FPRGELE S P KPL VTPGS TLTLLTRAEA 
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK 
QQDPSLPLLHTPIPLVSENPQ 


6596 


2 


1026 


PRLPTORYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
P I YQLNAPWLKGQERADLSNSLEE I YIQNIGES I LYLWVEKI RD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNI YAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHLMEILNVKNVMVWSRWYGGILIGPDRFKHIN^ 
I LVE KNYTNS P EES SKALGKN KKVRKDKKRNEH 


6597 


2 


1026 


PRLP VRRYHGRRRLQQRSRGHMAEGDAGSDQRQNEE I EAMAAI Y 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP : 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNI YAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHLME I LNVKNVMVWSRWYGG I LLGPDRFKHINNCARN 
I LVE KN YTNSPBESS KALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQES PLFNNVKLQRKLPVES IQIVLEE 
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNNSV 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS 
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKY I DENQDRYI KKLAKWVAIQS VSAW PEKRGE IRR 
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
DPQKKTVCIYGHLDVQPAALEDGWDSEPFTLVERDGKLHGRGST 
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
L I FARKDTF FKD VD YVCI SDN YWLGKKKP C I TYGL RG I C YFF I E 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN 
E AVAAVTE E EHKLYDD I DFD I EE FAKDVGAQ I LLHS HKKD I LMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE
WGEQVTSYLTKKFAELRSPNEFKVYMGHGGKPWVSDFSHPHYL 

ZVfIRR JXMTfT\7 , T7CVVPnT.TRFr3f5 < ?TDVTr , T.TT7nT?aTY^JrKnnV!T T DX/r'C 
rv. r\-r\i ' j in. j. v it o V H c LfU i *\.J_iVJVJO JL r V i Jj 1 C \J.Cii\ L \j t\iN Vl*lJ_iJ_iirV^o 

ADDGAHS QNEKLNRYNYI EGTKMLAAYLYEVSQLKD 


6600 


2 


934 


PGRLFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQIVLRAP 
AAIiSALCDLLASAADPQIRQFAAVLTRRRLNTRWRRLAAEQRES 
LKSLILTALQRETEHCVSLSLAQLSATIFRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGIiLLLSWVTSRPEAFQPHHRELLRIjLNET 
LGEVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMQ 

iJJXrl L^.c^-U\-M^l^/\1 )BfixjUJaxjx-u!iO ISVr'vJ.ixri Lioc* v Jj X r L.Jj.e» VAK 

NVALGNAI R IR I LCCLTFLVKVKS KALLKNRLIATLiAAHPFPHC 
GC 


6601 


529 


1420 


PRAAARAPPPAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE 
rtx x i. vv^jJriviVfii v uNy tr JjiMfirJiNi_irnrrt too A Jew jryiCoi*lr»VoyJrINxjVi v l 
NHQHQQQ^4APSTLSQQNHPTQNPPAGLMSMPNALTTQQQQQQKL 
RLQRIQMERERIRMRQEELMRQEAALCRQLPMEAETLAPVQAAV 
NPPTMTPDMRS ITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV . 
PTTPEDFLSNVDEMDTGENAGQTPMNINPQQTRFPDFLDCLPGT 

xm v uu\j x i ipi j iZiX/xj x, trxjc nu v ciOnLtr(AOctJ> c xjx nxi 


6602 


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSIxREVI KAMTKARNF 
ERVLG KITLVS AAPG KVI C EMKVE E EHTNAI GTLHGGLTATLVD 
NI STMALLCTERGAPGVS VDMNITYMS PAKLGED I VITAHVLKQ 
GKTIAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603 


79 


660 


P VG PS SLAARTGLGHL P FLHRLAS SRGLDMDLLQ FLAFL F VLLL 

HQQYKILDVNLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNVJNLLIRWGISFC 
NQTGVFNQGPHSPILSLM 


6604


3 


688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERVVLLGEFL 
HP CE DD I VCKCTTDENKVP Y FNAP VYLENKEQ I G KVDEI FGQLR 

GPPRGGGRGGRGGGRGGGGRGGGRGGG FRGGRGGGGGGFRGGRG 
GGFRGRGH. 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCMLSAPDFDRDLDDVLEKAKKANVV 
ALVAVAEHSGE FEKI MQLS ERYNGFVLP CLGVHP VQGLPPEDQR 
SVTLKDLDVALPIIENYKDRLLAIGEVGIjDFSPRFAGTGEQKEE : 
QRQVLIRQIQlxAKRIiNLPVlWHSRSAGRPTINLLQEQGAEiCVLL 
HAFDGRPSV7VMEGVRAGYFFS IPPSI IRSGQQKLVKQLPLTS I C 
LETDSPALGPEKQVRNEPWNISISAEYIAQVKGISVEEVIEVTT 
QNALKLFPKLRHLLQK 


6606 


2 


1682 


FVE IRPRAEVANLS AHSAS PI QDAVLKRLS LLED I V YRQLNGLS 
KSLGLIEGYGGRGKGGLPATLSPAEEEKAKGPHEKYGYNSYLSE 
KISLDRSIPDYRPTKCKELKYSKDLPQISIIFIFVNEALSVILR 
S VHS AVNHTPTHT ,T . K" P T T T .VDHNTQ TO? TT F T .TCVPT . P* R YVWKT? VPfiT . 

VKWRNQ KREGL I RAR I EG W KVATGQ VTG FFDAHVE FTAGWAE P 
VLSR IQENRKRVI LPS I DNI KQDNFEVQRYENSAHGYS WELWCM 
YISPPKDtWDAGDPSLPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENlELGIKVWLCGGSMEVXiPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENPGIDIGDVSER 
RAl^KSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGPLEIWTAILYPCHGWGPQIjARYTKEGFLHIXAI^TTTIjLP 
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWTIKNSIK 


6607 


137 


986 


VPACAGLKKEARSLLAS PPRLLNTKLQASCRALFSPP IQSRQTT 
GISFQGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP
SGVVNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFE 
DFVTALSILLRGTVHEKLRWTFNLYDINKDGYINQEEMMDIVKA 
IYDMMGKYTYPVLKEDTPRQHVDVFFQKMBKNKDGIVTLDEFLB 
SCQEDDN I MRSLQLFQNVM 


6608 


224 


1140 


RPCFSSPTGIiCPRLSYPMILIiQHAVLPPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGEEEIiSKGGEQDCALEELCKPLY 
CKLCNVTLNSAQQAQAHYQGKNHGKKLRNYYAANS CP PPARMSN 
WEPAATPWPVPPQMGSFKPGGRVI LATENDYCKLCDASFSSP 
AVAQAHYQGKNHAKRLRLAEAQSNS FSESSELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
SMCNVGAGEEMEFRQHLESKQHKSKVSEQRYRNEMENLGYV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLS PAEPRGPLASPVRAAPRAPC PAAEMSELNTKTS 
PATNQAAGQEEKGKAGNVKKAEEEEEIDIDLTAPETEKAALAIQ 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLHI FIRFPLTYPDMYMGMMCTAKKCGIRFQPPAI ILI 
YESEIKGKIRQRIMPVRNFSKFSDCTRAAEQLKNNPRHKSYIjEQ 
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKIi 
DDKEIiAKRKS IMDELFBKNQKKKDDPNFVYDI EVEFPQDDQLQS 
CGWDTESADEF 


6611 


978 


212 


pgcsgagsrvwwlpalrhlamgstessegrrvsfgvdeeervrv 
uksvrlse^wmrmkepsspppaptsstfglqdgljiiraphkes^ 

LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRLARELESREAELRRRDTFYK 
EQLBRIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKR IQKEbAE ITLDPP PNCSAGPKGDNI YEWRS 
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVICLDILKDNWSPALTISKVLLSICSLLTDCNPADPLVGSI 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVEDTYRQVIS CDKSICTLQ I TDTTGSHQFPAMQRLS I SKGHA 
FILVYS I TSRQS LEELKP I YEQI CEI KGDVES I P IMLVGNKCDE 
S PSRE VQ S S EAEALARTW KCAFMETS AKLNHNVKELFQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM 


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPIAHGARRGRPSPQWRAIiARIiGWEDCR 
DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEEtilDKLE 
WTMPS P S PKGLPVKQYAVQSQIiPVYEWPDVGSGEYDVGWASF 
GRLLNEALILKFPYGILNVHPSCLPRW 

GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEEQTS 
EQIFRLYRAIGNIIPLQTLWMANTIKLLDLVEVNSSVLADPKLT 
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRREDQMDGDGPRPREAFWEPTS SDEGGAASDDSM 
TDLYP PELFTRKDLGSTEDGDGTDDFLTDKEDEKAKP PREKATD 
EGRRETTVYRGLVQKRGKKQLGSLKKKFKSHHRKPKSFSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLS TPS ICRS
PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPIWLQPSPPPQSSP 
P PQPHP CHTCRGLVDS FNKGLERTIRDNFGGGNTAWEEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
IPESAGFFSEMTEDELWLQQMFFGII ICALATLAAKGDIiVFTA 
I FIGAVAAMTGYWLS ERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVS LLELEDRLQCP I CLEVFKES LMLQCX3HS YCKGCLVS 
LSYHLDTKVRCPMCWQAVDGSSSLPNVSLAWVIEALRLPGDPEP 
KVCVHHRNPLSLFCEKDQELICGLCX3LLGSHQHHPVTPISTVCS 
RMKEEL7U^FSELKQEQKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 


6618 


548


136 


DGKVARRAPNS PAFQNDI YPLVSAPRATTAES PWSKVLQNTQCR 
NVPKMTS ERSRI P CLS AAAAEGTGKKQQEGRAMATLDRKVPS PE 
AFIiGKPWSSWXDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PASSEVLTAAVMFLLLNCIVAVSQNMGIGKNGDLPRPPLRNEFR. 
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINLVLS * 
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSE IDLEKYKLLPE YPG : 
I LSDVQEGKHI KYKFE VCEKDD 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGSQDGS PLRETRKDPFSAAAAECSCRQDGLTVIVTACIiTFATG 
VTVALVMQ I YFGDPQ I FQQGAWTDAARCTSLGIEVLS KQGSSV 
DAAVAAALCLGI VAPHS SGLGGGGVMLVHD IRRNESHL I DFRES 
APGALREETLQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
QVLAFAAAVAQDGFNVTHDLARAIAEQLPPNMSERFRETFLPSG 
RPPLPGSLLHRPDIAEVIJDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGGVITEEDFSNYSAIjVEKPVCGVYRGHLVLSPPPPHTGPALI 
SALNILEGFNLTSLVSREQALHWVAETLKIALALASRLGDPVYD 
S T I TESMDDMLS KVEAAYLRGH INDSQAAPAPLLP VYE LDGAPT 
AAQVLIMGPDDF I VAMVSSLNQ P FGSGL ITPSG ILLNSQMLDFS 
W PNRTANHS APSLENS VQPGKRPLS FLLPTWRPAEGLCGTYIiA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 

S 1 PHAANMCJ 


6621 


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYELAKLLPLP 
AAI TSQLDKAS I IRLT I SYLKNRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRSPSALAIEVFEAHLGSHILQSLDGYVFALNQEG 
KFLYISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


319 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSCIIKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN 
I FDMAGHPFFYEVRKPF 


6623


i nor 


189 


KAIiFEKVKKFRLHVEEGDILYAM iVKQTVLKv IKFJjI I lAiN&A 
LVSKVQFTVDCNVDIQDMTCYKNFSCNHTMAHLFSKLSFCYLCF 
VS I YGLTCLYTL Y WLFYRSLRB YS FE YVRQETGFDDI PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNNEWTPDKL 
RQKLQTNAHNRLELPLIMLSGLPDTVFEITELQSLKLEIIKNVM 
I PAT I AQLDNLQELS LHQCSVKIHSAALS FLKENLKVLSVKFDD 
MRELPPWMYGLRNLBELYLVGSLSHDISRNVTIiESLRDLKSLKI 
LSI KSNVS K I PQAVVDVS SHLQKMCI HNDGTKLVMLNNLKKMTN 
LTELELVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ
HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLSYNDIRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
LPPELGDCRALKRAGLWEDALFETLPSDVREQMKTE 


6624 


218 


1786 


gsrrgggsri pavsthvapgrsvlrpfasgalrlrslvkalggc 
rgrpsglahlsqetshwrakrsgraclgdfpgeilrsfimkcta 
rewiirvttvlfmarai pamwpnatllekllekymdedgewwia 
kqrgkraitdn™qsildlhnklrsqvyptasnmeymtwdvele 
rsaeswaesclwehgpasllpsigqniigahwgryrpptfhvqsw 
ydevkdfsypyehecnpycpfrcsgpvcthytqwwatsnrigc 

AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVISAQQMSQIVS CEVRLRDQCKGTTCNRYEC 
PAGCLDS KAKVIGS VHYEMQS S I CRAAIHYGI IDNDGGWVDITR 
QGRKHY F I KSNRNG I QT I GKYQ SANS FTVS KVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625 


1124 


543 


PGPRGGGGSLLSTKALGRSRGLGMHPGPSSGGTEGGVPTALRPP 
GPLVPSTSDDNLLKNI ELFDKLALRFHGRLLFLKDVLGDE ICCW 
S FYGQGRKI AEVCCTS I VYATEKKQTKVEFPEARIFEETLNILI 
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR 
HITHDERPHGQQIVFKD 


6626 


3 


1498 


SAVEFVYTDRFHLILGISVEFLCSLRSDATMESITACLHALQAL 
LDVPWPRSKIGSDQDSGIEIJ^LHRVILTRESPSIQIiASLEW 
RQ 1 1 CAAQEHVKEKRRS AEVDDGAAEKETLPE FGEGKDTGGLVP 
GKSLVFATLELCVC I LVRQLPELNP KLTGSPGVKATKPQI LLED 
GSRLVSAALVILSELPAVCSPEGS IS ILPTIL YLTIGVLRETAV 
KLPGGQLS STVAASLQALKGI LS S PMARAEKS RTAWTDLLRS AJj 
TTI LDCWDPYDETHQELDEVSIiLTAITVPILSTSPEVrTI PCLQ 
KRCIDKFKATLEIKDPWQIKTYQLLHSIFQYPNPAVSYPYIYS 
LASCIMEKLQEIDKRKPENTAELEIFQEGIKVLETLVTVAEEHH 
RAQLVACLLP I LI SFLLDENSLGSATS IMRNIiHD FALQNLMQ IG 
PQYSSVFKSLVASSPALKARLEAAIKGNQESVKVKIPTSKYTKS 
PGKNS S IQLKTS FL 


6627 


1


697 


GI PHIiSSRDMTGTPGAVATRDGEAPERSPPCS PSYDLTGKVMLL 
GDTGVGKTCFLIQFKDGAFLSGTFIATVGIDFRNKVVTVDGVRV 
KLQIWDTAGQERFRSVTHAYYRDAQAT.T,T,T»YDITNKSSFDNIRA 
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAE FGGGSGGGGGSGGGGSGGGRGAGGEENKENE RPS AGS KAN 
KEFGDSLSLEILQIIKESQQQHGLRHGDFQRYRGYCSRRQRRLR 
PCTLNFKMGNRHKFTGKKVTEELLTDNRYIiL\TLMDAERAWSYAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRFEHQEWBCAAIEAFNKCKTIYEKLASAFTE 
EQAVLYNQRVEE I S PNI RYCAYN IGDQSAINE LMQMRLRSGGTE 
GLLAEKLEALI TQTRAKQAATMSEVEWRGRTVPVKIDKVRIFLL 
GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 

RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED 
KAFQKEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAFKNSLKDLPDVQEL ITQVRSE KCSLQAAAI LDA 
NDAHQTETSSSQVKDNKPLVERFETFCL.DPSLVTKQANLVHFPP 
GFQPIPCKPLFFDLALNHVAFPPLEDKLEQKTKSGLTGYIKGIF 
GFRS 


6629 


5653 


4549 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFPETSEPVWILG 
RKYS I FTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 
MLRCGQMI FAQALVCRHLGRDWRWTQRKRQPDS YFSVLCT AFIDR 
KDS YYS IHQ I AQMGVGEGKS IGQWYG PNTVAQVLKKLAVFDTWS 
SLAVH IAMDNT WMEE I RRDCR TS VP CAGATAFPADSDRHCNGF 
PAGAEVTNRPS PWRPLVLLI PLRLGLTDINEAYVETLKHCFMMP 
QSLGVIGGKPNSAHYFIGYVGEELIYLDPHTTQPAVEPTDGCFI 
PDESFHCQHPPCRMSIAELDPSIAVVRGGHLSTQAPGAECCLGM 
TRKTFGFLRFFFS MLG 


6630 


2 


423 


LVQCGGIRRRS AWGAMPGRHVSRVRAL YKRVLQLHRVLP PDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATAIiLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SESMKPKF 


6631 


2 


423 


LVQ GGGI RRRS A WGAMPGRHVSR VRAL YKR VLQLHRVL P PDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQEIiMQEATKPNRQFSI 
SESMKPKF 


6632 


1273 


588 


WNSRGRTQRGAAPLAPAAAMKAVVQRVTRAS VTVGGEQ I SAIGR 
GIC^LGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYEILCVSQFTLQCVLKGNKPDFHLAMPTEQAEGFYNSFLEQL 
RKTYRPELI KDGKFGAYMQVHIQNDG PVTI ELES PAPGTATSDP 
KQ LS KLEKQQQRKE KTRAKGP SES S KERNTPRKEDRS AS SGAEG 
DVSSEREP 


6633 


1145 


617 


ATGRHEGVPTLEG I IQQLVNGI ITPAT I PSLG PWGVLHSNPMDY 
AWGANGLDAI ITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV 
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 
VCRKS LTGQNTATNPPGLTGVSFSS S S SSSS SS S PSNENATSNS 


6634 


1 


1134 


CGGIPRKGSGPRRRLPMARLRDCLPRLMLTLRSLLFWSLVYCYC 
GLCAS IHLLKLLWS LGKGPAQTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLLLHGFPEFWYSWRYQLREF 
KS E YRWALDLRG YGETDAP I HRQNY KLDCL I TD I KD ILDS LG Y 
S KCVL IGHDWGGM I AWL I AI CYPEM VMKL I VI NFPHPNVFTE YI 
LiRHPAQLLKSS YYYFFQI PWFPEFMFS INDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPLKH 
HMVTT PTLLL WG ENDAFMEVEMAEVTR F YVKN Y FRLTTL S EASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420


470 


EMRAGQQLASMLRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLS YRLLDGEAALPAWFLHGLFGS KTNFNS I AK 
I LAQQTGRRVLTVDARNHGDS PHS PDMS YE IMSQDLQDUjPQLG 
LVP CWVGHSMGG KTAMLIaAIjQRPELVERL IAVD I S P VE S TGVS 
HFAT YVAAMRA I N I ADEL PRS RARKLADEQLS S VT QDMA VRQHL 
LTNLVETOGRFVWRVNLDALTQHLDKILAFPQRQESYLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSS PCFHDGTC VLDKAGS YKCACUVG YTGQRCENLLEAGKS KI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRHAKIGT 
WSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKIiQSAPTKKPALPFG 
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 
CIPICGKIENITAPKTQGLRWPWQAAIYRRTSGVHDGSLHKGAW 
FLVCSGAIiVNERTVWAAHCVTDLGKVTMI KTADLKWLGKFYR 
DDDRDEKTIQSLQISAIILHPNYDPILIiDADIAILKLLDKARIS 
TRVQPICLAASRDLSTSFQESHirVAGWNVLADVRSPGFKNDTL 
RSGWSWDSLLCEEQHEDHGIPVSVTDNMFCASWEPTAPSDIC 
TAETGGIAAVSFPGRASPEPRWHLMGLVSWSYDKTCSHRLSTAF 
TKVLPFKDWIERNMK 


6638 


1391


224 


GG1 PQAGGIvMAAF WWK/WUjUnL^KWKb^olfaAViAaKKl FFIjIjVM 
PNSDIDLSNLERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG 
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERRA 
RLRTASVPLDAVRAEWERTCGPYHKQRLAEYYGLYRDIiFHGATP 
VPRVPLHVAYAVGEDDLMPVYCX3NEVTPTEAAQAPEVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYTjHWLLTNIPGNRVAEGQVTCPYL 

TFDFYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FWPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


IGCFIMDGGDDGNLI I KKRF VS EAE LD ERRKRRQEEWE KVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFIiDEVSRQQELIEKQRREEELKEIiKEYRNNLKKVGISQE 
NKKEVE KKLTVKP I ET KNKFS QAKLIiAGAVKH KS S ESGNS VKRL 

GAYSGS SDS ES SSDSEGTI NATGKI VSS IFRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRI VLVGKTGSGKSATANTILGEE I FDS 
R IAAQAVTKNCQKASREWQGRDLLWDTPGLFDTKESLDTTCKE 
I SRCI I SSC PGPHAIVLVLLLGRYTEEEQKTVAL I KAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQELVELIEKMVQCNEGAYFSDDIYKDTEER 

YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE 
NPQENWAD IQ I WDKS PLPLGFS PVCDPMDS KAS VSKKKRMCV 
KXLPLGATDTAVFDVRLSGKTKTVPG YLRIGDMGGFAI WCKKAK 
APRPVPKPRGLSRDMQGLS LDAASQP S KGGLLERTASRLGSRAS 
TLRRNDS I YEAS SLYG I SAMDGVP FTLHPRFEGKS CS PLAFSAF 
GDLTIKSLADIEEEYNYGFWEKTAAARLPPSVS 


6642 


22


1296


PLEERMMTKMDPNDQAQRDI I FELRR IAFDAESDPSNAPGSGTE 
KRKAMYTKD YKMLG FTNH I N PAMD FTQT PPGMJ-iAJbJLJNMIjYJjAKV 
HQDTYIRIVLENSSREDKHECPFGRSAIEIiTKMLCEILQVGELP 
NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 
KVMQWREQ I TRALPSKPNSLDQFKS KLRSLS YSE ILRLRQSER 
MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 
NRRRQERFWYCRLALI^KVLHYGDLDDNPCXSEVTFESIiQEKIPV 

i\U J. 1\J\L V 1 ir li^JiAJi A^i-5JjrVl^IN XVCj V UdUi-ir O X JLJ x Utr XJo X J-UNT 

I APNKYE YC I W I DGLSALLGKDMS S E LTKSDLDTLLS MEMKLRL 
LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


S LHAPAEGRTRGRLAE KP KMXjTRK I KLWD I NAH I TCRL CSG YL I 
DATTVTECLHTFC^SCLVKYIjEENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDIVYKliVPGLQEAEriRKQREFYHKLGMEVPGDIKGETC 
SAKQHLDSHRNGETKADDS SNKEAAE EKPEEDND YHRSDEQVS I 
C!LECNSSKLRGLKRKWIRCSAQATVLHLKKFIAKKLNLSSFNEL 
DILCNESILGKDHTLKFWVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGS S PVQLVSSTMS VRTLPLLFLNLGGEMLYI LDQR 
r.RAONTT PGDKARKVLjNDI ISTMFNRKFMEELFKPOELYSKKALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL 
LVTFNHLDTIKGFIRDSPTI LQQVDETLiRQLTE I YGGLSAGEFQ 
LIRQTLLIFFQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGL I RMFNNKGEEVKRI EFKHGGNTVPAPKEGS FEF YGDRVL 
Kl^TNMYSVNQPVETHVSGSSKMIiASWTQESIAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EVINIQATQDQQRSEELARIMGEFEITEQPRLSTSKGDDIjLAMM 
DEL 


6645 


6530 


4646 


FVEGIiAGYVYKAASEGKVLTLAALLrJTRSESDIRYLLGYVSQQG 
GQRSTPLI IAARNGHAKVVRLLLEHYRVQTQQTGTVRFDGYVID 
GATALWCAAGAGHFEVVKLLVSHGANViraTTVTNSTPLRAACFD 
GRLDIVKYLVENNANIS I ANKYDNTCLM I AAYKGHTDVVRYLLE 
QRAD PNAKAHCGATALHFAAEAGH IDI VKELI KWRAAI WNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFAND 

P FMYD T T IfT VTJVT . VT. .PP nnnn n\T TT.P TTFVT .PPT HSVfJKTO TP 

CRNPQEIiESIRQDRDALHMEGLIVRERILGADNIDVSHPIIYRG 

AVYADNMEFEQCIKLWliHAMLRQKGNRNTHKDT^FAQVTSQM 

IHLNETVKAPDIECVLRCSVLEIEQSMNRVKNISDADVHNAMDN j 

YECOTiYTFIjYLVCISTKTQCSEEDQCKINKQIYNIjIHIiDPRTRE 

GFTLLHIiAWSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 

VDNEGNSALHI IVOYNRPISDFLTLHSII ISLVEAGAHTDMTNK 

QNfCTPl^KSTTGVSEILLKTQMK^^ 

IPRTLEEFVGFH 


6646 


176 


890 


PSSRMraLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKG I S DVRRTFCLFVTFDLLF VTLLW I IELNVNGGIENTL 
E KEVMQYDYYS SYFD I FLLAVFR F KVLI LAYAVCRLRHWWAIAL 
TT AVTS AF1YLAKVI LS KLFSOGAFG YVLP 1 1 S F I LAW T E T WFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTOLMARIESY 
FflREKKGT ^DVPRTFPTiPVTPnT.TiFVTTiT.WT TFTiKFVNf?C!>T RNTT, " 

UV7{\u^\i\wXwi/ V 1U\1 X V»XJ£ V 11 XJXJXJX V X XJXJfV X X JLiXJVV Vli ww X CJLN X XJ - 

EKEVMQYDYYSSYFDIFLIxAVFRFKVTjIIxRYAVCPJjPJIWWAIAL l 
TTAVTSAFLLAKVILSKLFSQGAFGYVIiPIISFILAWIETWFLD 
FKVIiPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GS EEAEEKODS EKPLLEL 


6648 


413 


897 


P^CWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDYIEDDNPEIj 
IR PQKLINPVKTSRIWQDxxHF^LLMNQKRGLAPQI^PBLQKVME 
KRKRDQVIKQKEEEAQKKKSDLE I ELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649


1357 


832 


WIPRAAGIRHEVKWDVKEIMSC3HNIYVDALLKEFEQFNRRLNEV 

QLDFQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWi 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLLAAIDDIDRPKR. 


6650


32 


765 


LVPLVFSLLVQS CKQVYRS I AMKFVPCLLLVTLSCLGTLGQAPR 
QKQGSTGEEFHFQTGGRDSCTMRPSSLGQGAGEVWLRVDCRNTD 
QTYWCEYRGQPSMCQTVFAADPKSYWNQALQELRRLHHACQGAPV 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKP FQALCAFL I S F FRG 


6651 


3425 


1353 


AKELLKVGDFSLCAGPYQNTADTMENLSKEPLASFVSESFDISA 
CX3IATEHVKIDNSGEGLTAEAGSETLSRDGEVGVNSDMHYELSG 
DSDLDLLGDCRNPRLDLEDSYTLRGSYTRKKDVPTDGYESSLNF 
HNIWQEDWGCSSVA^GMETSLPPGHWTAAVKKEEKCVPPYVQIR 
DLHGILRTYANFSITKELKDTMRTSHGLRRHPSFSANCGLPSSW 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKES PTQ IS IGAFPSTKI SEAPFLHPAPRSRS PLLVTWESDP 
RPQGQPRRGYTASSIxDSSSSWRERCSHNRDLRNSQRNHTVSFHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEM YLPFPGRS AS YEDI 1 1 DVCTNLHVKLRS WKEA 
C KSTFLFYLVETEDKS F FVRTKNLLRKGGHTE I E PQHFCQAFHR 
ENDTLI 1 1 IRNEDISSHLHQIPSLLKLKHFPSVI FAGVDS PGDV 
LDHTYQELFRAGGFVISDDKILEAVTLVQLKEIIKILEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKN I MLKS FQSANI I ELLH 
YHQCDSRSSTKAEILKCLLNLQIQHIDARFAVLLTDKPTIPREV 
FENNGI LVTDVNNF I ENIEK I AAP FRSS YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRLDHALNSPTSPC 
EEVIKNLSLKAIQLCDRDGNKSQDSGIABMEELPVPHNIKISNI 
TCDS FKI S WEMDS KS KDRITHYF I DLNKKENKNSNKFKHKDVPT 
KIiVAKAVPLPMTVRGHWFLS PRTE YTVAVQTASKQVDGDYWSE 
WSE I IEFCTADYS KVHLTQLIiEKAEVIAGRML KFSVFYRNQHKE 
YFDYTOEHHGNAMQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPQDS P YGRYRFE I AAEKLFKPNTNLYFGDFYCMYTAYHYV 
ILVIAPVGSPGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 
QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAASRGADDAME S S KPGPVQWLVQKDQHS FELDEKALAS I 
LLQDHIRDLDVWVS VAGAFRKGKS F ILDFMLRYLYSQKE SGHS 
NWLGDPEEPLTGFSWRGGSDPBTTGIQIWSEVFTVEKPGGKKVA 
VVLMDTQGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDE I FQKPFQTLMFLVRDWSFPYEYS YGL 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATS PDFDGKLKD IAGEFKEQLQALI PYVLNPSKLME KE ING 
S KVTCRGLLE YFKAYI KI YQGEDIiPHPKSMLQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQLALDHFKKT 
KKMGGKDFSFRYQQELEEEIKELYENFCKHNGSKNVFSTFRTPA 
VLFTG I VALYIASGLTGF IGLE WAQLFNCMVGLLL IALLTWGY 
I RYSGQYRELGGA1DFGAAYVLEQAS SHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTSLSPSQCSSFNLAMASAGMQILGVVLTLLGWVNGLVSCALPM 
WKVTAFIGNS I WAQWWEGLWMS CWQSTGQMQCKVYDSLLAL 
PQDLQAARALCVI ALLVALFGLLVYLAGAKCTTC VEE KDS KARL 
VLTSGIVFVISGVLTLIPVCWTAHAVIRDFYNPLVAEAQKRELG 
ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP . 
AISRGPSEYPTKNYV 


6655


341 

> 




KDAYMFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA 
INI PLKEVKERIATAVPDKM^TVKVYCNAGRQSGQAKE ILSEMG 
YTHVENAGGLKD I AMP KVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVPPPQKRMAKVAK . 

DLNPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 

KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 

KRNRMIMTNPIATGKDPTFDTITYEWAPPGVTQKLGLQYMELIP 

KEKQPVTGTEGAFYRRRQLMHQLP I YDQDPSRCRGLLENELKLM 

EEFVKQYKSEALGVGEVALPGQGGLPKEEGKQQEKPEGAETTAA 

TTNGSLSDPSKEVEYVCEIiCKGAAPPDSPWYSDRAGYNKQWHP 

TCFVCAKCSEPLVDLIYFWKDGAPWCGRHYCESLRPRCSGCDEI 

I FAED YQRVEDLAWHRKHFVCEGCEQLLSGRAYI VTKGQLLCPT 

CSKSKRS 


6657 


830 


2120 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDL INLTQEDFKKPPLCRVS SDNGQRLLDM I ETLKMEH 

IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 
VPPKEVQPPLPDTFFDHFNRVQWAFS I CEINGMILVGLWLIQWL 
LLKYKS I ISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK 
LFGDW E AQLRR I MKLI AGGGLS ITGSHNMCGDYLYSGHTVMLTL - 
TYLF I KEYS PRRL WWYHW I CWLLS WG I FCI LIAHDHYTVD WV 
AYYITTRLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 


6658 


35 


855 


HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM 
FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QTPEGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPFIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 
IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 
MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFFCWGFWIiWRAHSMSNLHSLPGIj 
RGLTSISRNQLQCTNAMRVINNYQRRWKNQNTFIiliATFANWNV 
CGNPTITCPHNRTLNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQTPANMFYIVACDNRDQRRDPPQYPWPVHLHTII 


6660 


514 


1707 


CAAS LDCRHHLCE PDMKLVWPS AKLLQAAAGASARACDSVTSNV 

LPLLLEQFHKHSQS sqrrti lemllgflklqqkws yedkdqrpl 

NGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPDLLSYED 
LELAVGHL YRLS FLKE DS QS CRVAALEASGTLAAL YP VAFS SHL 
VPKLAEELRVGESNLTNGDEPTQCSRHLCCLQALSAVSTHPS IV 
KETLPLLLQHLWQVTOGNWAQSSDVIAVCQSLRQMAEKCQQDP 
ESCW YFHQTAI PCLLALAVQASMPEKEPSVLRKVLLEBE VLAAM 
VSVI GTATTHLS PELAAQSVTH I VPLFLDGNVS FLPENS FPSRF 
QPFQDGSSGQRRLIALLMAFVCSLPRNVSEHIWEVLLFNLDKVT' 
PG 


6661 


179 


430 


GVHAASGTLSATWLAEAKMFDSLAKAGKYLGQAAKljM igmpdyd.. 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RSLPKPAPAQPASIHCARFSGVTPPTAKTAMSDGNTAFNALMYC 
GPKADDGN I FSACAPASSAVKAS VS VAQPGQAVT P 


6663 


3 


1005 


RPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF 
PKENLFS FQTAS TTMQAI SNFRKHLRMVGS RRVKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
BKGLEAVACDTEGFVPPXVMIilSSKVPKAEYI PTI IRRDDPSI I 
PIIiYDHEHATFEDILEEIERKLNVYHKGAKIWKMLIFCQGGPGH 
LYLLKNKYATFAKVEKEEDMIEFWKRLSRLMSKVNPEPNVIHIM 
GCYI LGNPNGEKLFQNLRTLMTP YRVTFESPLELSAQGKQMIET . 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6664 


58 


968 


PRLLRLPRSVVVMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNP I SSWFTAMLHCFGGGILSCLJjLAEPPIiKF 
IiANHTNILLASS I WY I TFFCPHDLVSQGYSYLPVQLLASGMKEV 
TRTWKI VGGVTHANS YYKNGWIVMIAIGWARGAGGTI ITNFERL 
VKGDWKPEGDEWLKMSYPAKVTLLGSVIFTFQHTQHLAISKHNL 
MFLYTI FIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
SCEKKSEAKSPSNGVGSLASKPVDVASDNVKKKHTKKNB 


6665


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRW LGPGCTQNPCS VHTATGPE PRKL PLLPPDS PNSG YPKE PA 
ALCPGI PSPCRMTHQDLS I TAKL INGGVAGLVGVTCVFP IDLAK 
TRLQNQHGKAMYKGM IDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAI KLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATIiIAWELLRTQGLAGLYRGLGATLLRDIPFS 1 1 YFPLFANLNN 
LGFNELAGKASFAHSFVSGCVAGS I AAVAVTPLDVLKTR IQTLK 
i\vj LiijniuUJt. i lj\ — rvrv. 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTIi 
WSPGCWPQPIQKEGVGLWDIRKPQSSLLRYGGNLSLQSAMSVRF 
NSNGTQLLALRRRLPPVLYDI HSRL PVFQFDNQVYFNSCTMKS C 
CFAGDRDQY I LSGS DDFNL YMWR I PADPEAGGIGRWNGAFMVL 
KGHRS I VNQVRFNPHTYMICSSGVEKI I KI WS P YKQ PGCTGDLD 
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
FFDSLVRREIBGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADNAFHIiGPLRVTTTNTVASTPPTPTCED 
PRS PSPEDESS SSS S SS S SEDBEETjNERRAS TWQRNAMRRRQKT 
TREDKPSAPIKPTOTYIGEDNYDYPQIKVDDLSSSPTSSPERST 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPS SSKEACLNIAMAQRNQDLP P 
EGCSKDTPKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEVV 
AYS S PGHSDTDRDNSSLTGTLLHKDCCGS EMACETPNAGTREDP 
TDT PATDSSPAVHGHSGIjKROR I E IiEDTDS ENS S SE KKLKT 


6667 


171 


1310 


AEEVERLAAMRSDSLVPGTHTPPIRRRSKFANLGRIFKPWKWRK 
KKS EKFKHTSAALERKI SMRQSREELI KRGVLKE I YDKDGELS I 
SNE EDS LENGO SLS SS OL S TjPALS EME PVPM PRD PCS YEVLOP S 
DIMDGPDPGAPVKLPCLPVKLiSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVIjPSQIQHQIjQYGSHGQHLPSTTGSL 
PMHPSGC^IDELim'liAMTMQR 

GDGVTKAGPMGLPEIRQVPTVVIECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELSE KN I LPROTDE ERLE LROOIGT KL 


6668 


714 


358 


TLAVATGPALTLRCHVCTS S SNCKHS WCPAS S R FCKTTNTVE P 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKIjH 
NAAPTRTAJ^SALSI^LAiSLI^VILAPSL 


6669


ti 3 5 


1 ?07 


KTJFFTP Tf nYnYMT.nWPFRYY^HYYWYYSRPTjAPKVDVPVVT T ,\7^ 
VCAI SVFQFFSWWNS YNKAI S YLATVPKYRIQATE I AKQQGLLK 
KAKEKGKNKKSKEEIRDEEENIIKNIIKSKIDIKGGYQKPQICD 
TiTiTjFQI ILAPFHLCSYIVWYCRWIYNFNIKGKEYGEEERLYI IR 
KSMKMSKSQFDSLEDHQKETFLKP^LWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670


> ■ 


* 

a 


\7 ZVP T + n P A A KTVI c; Q F p P PP V PGGPT A PT J iFFTfS G A P P T PGR <5 <? P a 

VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM 
PPGFYP P PGPHP PMG YYP PGP YTPG P YPGPGGHTATVIiVPSGAA 
TTVTV 


6671


1 


763


LPAE KPRSAPNMAGGRCKPQLTALLAAWI AAVAATAGPEEAALP 
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKTJ(3EILOISVGKVDVIOEPGTjSGRPFVTTLPAFFHAiCDGIFPP 

YRGPG I FEDLQNY I LEKKWQS VE PLTGWKS PASLTMSGMAGLFS 
ISGKIWHLHNYFTV^IX3IPAWCSYVF1^IATLWGLSMDLVL*V 
ISQCNWDPP YRHVS * /RPSTNLGVHTAHTSEHLRL 


6672 


304


1089 


APGSKP VQFMDFEGKTS FGMS VFNLSNAI MGSGILGLAYAMAHT 
GVIFFLALLLCIALLSSYSIHLLLTCAGIAGIRAYEQLGQRAFG 
PAG KWVAT VI CLHNVGAMS S YL F I IKSEIjPLVIGTFLYMDPEG 
DWFLKGNLLI I IVSVLI ILPLALMKHLGYLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFHS*LTCVIjTQWPIMAFAFVCHPGGAGPSITELCRAFQAQD 


6673 


1116 


1963 


LQIQTHHTHHGARVTHLGSHQlxLANAGTMLCROX}SSSMAPA 

S VTCGP s pcvrkqes atkclhigacgsdlwargweqg*g* glnv 
wlcpcvafhrgarpqaeeggarwnslvsspwippnp*hssigae 
navprp*qg*kvnpsgqerqs\wvlplpvpgeplklpglpg*nk 
sfsrv/sgskgkwilprqlm*as*r\tprfvpgtqwvpitw/pl 
itwh*saptpplkacpapp^sdpcssclscpcvtqhprfsiyrgw 
fgaghchsscdftrkgaaggpg 


6674 


1 


440 


le fd ymcq yd yve vrdgdnrdgq 1 1 krvcgnerpap iqsigssl 
h vl fhsdgs knfdgfhai yee i tac s ss p cfhdgtc vldkags y 
kcaclagytcqrcej^lleerncsdpg/wpsqwvpennrgpwayq 
ptpc* igtrvafflt 


6675 


277 


1678 


GNWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCEMVLI DHDVD 
LEKIHPPSMPGDSGSEIOGSNGETOGYVYAOSVDITSSWDFGIR 

4JMA\ih44ii 47 fcW> * 1*7 VJ Arias' WW YfWtJllWlJ X A V X *»>£W F A *W> kW 7 » • 4W la \_J X. J. V 

RJRSNTAQRLERLRKERQNQI KCKNIQWKERNS KQSAQELKSLFE 
KKSLKEKP P I S GKQS ILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPl^SSQDRLLPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
G FSTLALVEKY S S PGLTS KES LF VR I NAAHGF S L IQVDNTKVTM 
KE I LLKAVKRRKGSQKVSGSRADG VFEEDSQ I DI ATVQDMLS SH 
HYKS FECVSMIHRLRFTTDVQL/GCALFPG VLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KAS TKFWI KQKP I S ID S DLLCAC \ DLAEE 


6676

277 


1678 


GNWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCEMVLI DHDVD 
LEKIHPPSMPGDSGSEIOGSNGETOGYVYAOSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQ I KCKNIQWKERNS KQSAQELKSLPE 
KKSLKEKP PISGKQS I LSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC+EGISGDKVEIDPVTNQ 
KASTKFWIKQKP I S IDSDLLCAC\DLAEE 


6677 


277 


1678 


GNWPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCEMVLIDHDVD 

RRSNTAQRLERLRKERQNQ I KCKN I Q WKERNS KQSAQELKSLPE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPl^SSQDRlJLjPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKBSLFVRINAAHGFSLIOVDNTICVTM 

KE ILLKAVKRIlKGSblWSGSPADGVFEEDSQIDIAiVQDMLSSH 
HYKS FKVSM I HRLRFTTDVQL/GCAL FPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS *DSHKC* EGISGDKVEIDPVTNQ 
KASTKFWIKQKP I SIDSDLLCAC\DLAEE 


6678


221 


865 


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR 
PFPCSQLPMSQGCLWHLDCCCPWVP Y I PGQQ WRKGRQRMRN * QS 
LLG SDQE S VGL EDL CVFVN FLLHVLLGLFP * PHE LFLLP WDLG 
FLFPLLLQGGCHCLVLPANLVSQAPQIGKLSCRLQTHDLEGSRN 
HHPLFLWGRWDAVKHLETVQSGLAS LGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRS PGQNWVKTVDGWKRFLDEKSGS FVSDLi 
SSYCNKEVYNKENLFTJSLNYD/ SCSQEEKEGHAE*QNQNS\DFH 
QEKWIYVmCGSTKERKGYCTIiGEAFNRLDFSTAILDSRRFNYW 
RLLE L IAKSQLTSLSG I AQKNFMNI LEKWLKVLEDQQN ITLIR 
ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGLF 


6680 


1498 


2951 


PLCTLPLMPSALPGWAGERWEKQWPLA/PGPGTWQTPVGSISEE 
P\RKNEPDTHCPRGEARPEV*HLPKPHSPGSEGAEIQTSA*ALP 
/NQVS PPQPM * GAEENGDQRGGKE BAGEELHRS S SGLTAAPGFP 
EVHRNLQTFPGLPSRGGGP / GGAGTQGSWAPGEQPP/SPLLPAS 
MQRSQAGLPGWEAGLVES PTHHI PALRPSGTNATGEAFPSTTCS 
SGP\PAPPGPTGLRPGGGSSSGGHG**PGLPVGKV\GALGAAQD 
PQSQGRGPTQGTVGTEMLLSGLGSAKACPAARPAVP*LPSDPAS 
TI PKKGTRGFGEGPGVLQERNRWWGRAQGFTS ADAAGTAP PGV 
* LPAPLSQPPGATE PQVRACGMAPPS PGTSGRLVAWGRHPG PQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH* 
WQDPPSS PRTGCLPGI PARQAYSAPRTRSRPG I RTGRAAYGF IR 
FQGGGGG 


6681 


1169 


511 


INYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSERE 
KMTVGVLTQTVGPWSRPGAYLSKQLDGVSKGWPPCPRALAATAL 
IiAQEADELTLRQNLNRKSPHA\VVTLINTKGHH*LINARLTRYQ 
TLLCENPHKT IEVSNT/ LNPATLLLVTBS PVTCHNCLEVLDS VYS 
SRPNLRDHP * TS VDWELYVDGSGFANPCKVTLKKETS PAPVTPR 
S 


6682






TVIjCGAMQVSSIjNEVTCIYSLSCGKSIjPEWLSDRKKRALQKKDVD 
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT 
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENNVCD INS VHGLFATGT I EGRVECWDPRTRNRVGLL 
D\AP*TVSQQIQR*TSIiPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK 
NSGKIFTSL^PEHDI^Va^YPNSGMLLl'ANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6683






TVLCGAMQ VSS LNE VK I YS LSCGKS L PEWLS DRKKRALQKKD VD 
VRRRIELIQDFEMPTVCTTIKVSKDGQYIIiATGTYKPRVRCYDT 
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIBFHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEWRLNLEQGRYLN 
P JjQTDAAEJNNVCX) INS VHGLiFATGT I EGRVE CWDPRTRNRVGLL 
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK 
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCS FLDNLTEEIjEENPE SNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA 
PP \SGRRHA*RPA*WLGGPGGD£GGREEGGS / GELQRAMESKMG 
ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRIjFDPDPQKVL 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
ECAVVIjGSIiAMGTENNVKSLLDCHI I PALLQGLLSPDLKFIEAC 
LRCLRTIFTSPVTPEELLYTDATVlPHLMALLSRSRYTQEYICQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSyKVRMQALKCFS 
VLAFENPQVSMTLVNVLVDGELLPQI FVKMLQRDKP IEMQLTSA 
KCJjT x M CKAGAI RTDDNC I VDECTLPCLVRMCSKERLLEERVEGA 
ETLAYLIEPDVELQRIASITDHLIAMliADYFKYPSSVSAITDIK 
RLDHDLKHAHE LRQAAFKL YASLGANDED I RKKVS LGEGR PP VL 
TASRQGVTST 


6686 


310 


927 


DS VTFDDLAVD FTP KE WTLLD E>TQ RNL YRDVMLENY KNLAT VG Y 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 

SENTFECYLYGVD FLTLHKKTSTGEQRS VFSHVWKKPS S LNPDV 
VCQKNRCTRKKKAF * LQLTLGKSFH * S IHT 


6687 


181 


915 


E AMLE AP YKKE E DEQQRKE VKKDYPSNTTS S TSNSGNETSGS S T 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDRVHYRSPPLATGEPVDNLSPEERDARTVFCMQLAAR 
I RPRDLEDFFSAVGKVRDVRr I SDRNSRRS KGIAYVEFCE IQS V 
PLAIGLTGORLLGVPI IVOASOAEKNRLAAMANNLOKGNC5GPMR 
IiYVGSLHFNITEDMLRGI FEPFGKV 


6688 


1025 


1 


AEVPOTPRVFHKCPDSCWRFKFQPIQLQPYILLSFSSEKPPISF 
SEPGLPR/SATARMATAAAPPNSSIDLPSDSGMGFISPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
TMSELEELFSIiFSPAPLLSKLFTSSGS IAICCQDSGPSDTGRLS 
VCQLWIiADSDTGKLSDCQEVVTVGDSGGLTCPELSLGRM*MSLL 
WO 01/53312 PCT/US00/34263



SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSAGYATGATG T GnTARGnGT.lTT.lfWfTr.T.CaT nQQCDTC*CTOC 

AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSK7SI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMSIi+ATKF*RNACNPNCLSSKKSAL*LSLNQRF 
GGSASRKPGNISFNSQKCSALS YCCNFVI KPREVS VS S ENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YIRLTPDMQSKQGALWNRVPCFLRDWELQVHFKIHGQGKKNL\H 
GDGLAIWYTKDRMQP 


6691 


287


1401


ujvx r» x o CtoAniucx JvuKJroyi-uN>iv r ^aUJ\i\n x 1 LuiiD VAV 
DFTWBEWQLLGAAQKDLYRDVMLENYSNLVAVGYQASKPDALFK 
LBQGEQLWTIEDGIHSGACSpIWKVDHVLBRLQSESLVNRRKPC 
HEHDAFEN I VHCS KSQFLLGQNHDI FDLRGKS LKSNLTLVNQSK 
GYEI KNS VEFTGNGDS FLHANHERLHTAI KFPASQKL ISTKSQF 
ISPKHQKTRKLEKHHVCSECX3KAFIKKSWLTDHQVMHTGEKPHR 

HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/GKGFIQKTCLIAHQRFHTER 


6692 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSQGFNKLAETLRWCLNLG ILEVTVYAFS I ENFKRS KS EV 
DGLMDtiARQKFSRLMEEKEKLQKHGV 

DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFIiLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSLWERFCANIIKAJ3PMPKHIAFIMDGNRRYAKKCQVE 

K.yr»ljrioyorlN J\JjMniXXJltn^XAlNxAjXXjll« V X v xrvT OXlMUr XvKolvofciV 

DGLTOIjARQKFSRx^EEKEKLiQKHG 

IAQAVQATKNYNKCFLNVCFAYTS RHE I SNAVREMAWGVEQGLL 
DPSDISElSLLDKCLYTNRSPHPDIIiIRTSGEVRLSDFLLWQTSH; 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSS VSTETAS S VRRQAAESRQHELP VR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
1UAVGPSGCOTEP\FDEWPSLFLGDAYAAPJDKSKLIQ1^ITHVV. 
NAAAGKFQ VDTGAKFYRGMSLEYYG I EADDNPFFDLS VYFLP 


6695 


292 


813 


SLLLHIxAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR. 
EVHSIiGOITiPOnGT.TAFAGPPFAnnPWflGPriT GT.P2V AUTm?l\ A a 

1^VGPSGCHTEP\FDEWPSLFLGDAYAARDKSKLIQ1^ITHVV 
NAAAGKFQVDTGAKFYRGMSLE YYGI EADDNP FFDLSVYFLP 


6696 


1 


782 


PR\^GRVGERWAFLSVPAAMSSE^PLLIiAWSYFRRRKFQLCAD 
LCTQMLEKSPYI^AAWILKAPJ\LTEMVTrDEIDVDQEGIAEMMIi 
DENAI AQVPRPGTSLKLPGTNQTGGPSQAVRP ITQAGRP I TG FL 
R PS TO SGR PGTMEO ATRT PR.TAYTAR P I TS S SGRFVP r /3T AS MT . 

TSPDGPFINLSRLl^TKYSOKPKLAKAL IE YI FHHENDVKTALD 
IxAALSTEHSQYKDWWWK/DQ I EKCYYRVGMYREAE KQIKSS 


6697 


3 


782 


PPLFLRRMSx^ALRPGSRKVMAVVPASLSGQDVGSFAYLTIKDR 
I PQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISlxLSKxxRNE 
LQTDKPFIPLVEKFVimilWNQYI^YQQSLLNESDGKSRWFYSP 
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIED1jD\ENQLKDEFFKLLQISLWGEISVDIi 
SLXSGGESSSQOTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 


668 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP/PARRVLPRAMTASAQPRGRRPGVGVGWVTSCKHPRCV 
LLGKRKGSVGAGSFQLPGGHLEFGETWEECAQRETWEEAALHIaK 
OTHFASVWSFIEKElvTx^iYVTILMKGEVDVTro 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ESKRI I YNHAF F FQES KWSGGILQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP / RLECGQGFSQQENGHCMDTNECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVLMEVEVEAKANGEDCLNQVCRRLGI 
IEVDYFGLQFTGSKGESLWLNLRNRISQQMDGLAPYRIjKIjRVKF 
FVEPHLILQEQTRHIFFLHI KEALLAGHLLCS PEQAVELSALLA 
QTKFGDYNQNTAKYNYEELO\KELSSATLNSIVAKHKELEGTSQ 
ASAEYQVLQI VSAMENYG I E WHSVRDSEGQKLL IGVGPEGI SIC 
KDDFSPINRIAYPWQMATQSGKNVYLTVTKESGNSIVLLFKMI 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDLKGHLASLF 
LNEN INLGKKYVFD I KRTS KE VYDHARRAL YNAG WDLVSRNNQ 
SPSHSPIiKSSESSMNCSSCEGLSCC^TRVLQEKLRKLKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSb 
LLTGSRSQVLAR 


6702 



397 


1971 



PIAKFLKLDLVNVLCLPMEDVFLFYRTCFCSMGLGSSCHLSLPK 
RAEALLCSRKATWRDLVAVRMAEEQE FTQLCKLPAQPS HPHCV 
NNTYRSAQHSQALLRGLLALRDSGILFD WLWEGRHIEAHRI L 
LAASCDYFKGMFAGGLKEMEQEEVLIHGVSYNAMCQILHFIYTS 
ELELSLSNVQETLVAACQLQIPEIIHFCCDFLMSWVDEENILDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LS SNRLEVSCETEVYEGAIjLYHYSIjEQVQADQ I S LHEPPKLLET 
VRFPLMEAEVLQRLHDKLDPSPLRDTVASALMYHRNESLQPSLQ 
SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASLAPRMSNQG I AVIiNNFVYLIGGDNNVQGFRAESRCWRYDP 
RHNRWFQIQSLQQEHADLSVCWGRYIYAVAGRDYHNDLNAVER 
YDPATNS WAYVAPLKREVYAHAGATLEGKMY I T CGRKGR I T 


6703 



45 

1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVA 
KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 
YGEHKRHLEFSHDQYRELQRYAEEVGIFFTASGMDEMAVEFLHE. 
LOTPFFKVGSGDT^FP YLEKT?AK/ TRGWHS VtiRDVCG VQLNDE 
TS S WDVLGRWTS KEKVIJ4VLVLDYSGRPMVI SSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 
. IGYSGHETGIAIS VAAVALGAKVLERH I TLDKTWKGSDHSASLE 
PGELAELVRSVRLVERALGS PTKQLLPCEMACNEKLGKS WAKV 
KI PEGTILTMDMLTVKVGEPKGYPPEDI FNLVGKKVLVTVEEDD , 
X IMISEj 


6704 


82 


1007 


TMNTRNRWNSGLGASPASRPTRDPQDPSGRQGELSPVEDQREG 
LEAAPKGPSRESWHAGQRRTSAYTLIAPNINRRNEIQRIAEQE 
LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRI KKEAEEABLQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN 

RAKIHQTEHRRVNWAFLDRLQGKSQPGGLEQSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RIiCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
S YKRKGG I MS T I AAF YGG KS IL I TVATGFLGKE LMEKLFRTS PD 
LKVIYILVRPKAGQTLQHRVFQILDSKLFEKVIEVRPNVHEKIR 
AI YADLNQNDFAISKEDMQELLS CTNI IFHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLIRDWPNIYTYTK 


6706 


130 


531 


FTHSSSSHSQEMLGKLNMLRNDGHFCDITIRVQDKIFRAHKVVIi 
AACSDFFRTKLVGQAEDENKNVLDliHHVTVTGF I PLLEYAYTAT 
LS INTENI IDVLAAAS YMQMFSVASTCSEFMKS S ILWNTPNSQP 
EK 
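
The columns of this table can be cross-checked against one another. The following Python sketch is an illustration only: it assumes that the beginning and end nucleotide locations are inclusive and that every residue or marker symbol in the legend occupies one codon position. The function names `summarize_segment` and `codon_span` are hypothetical and do not appear in the original disclosure.

```python
import re

# Symbols from the table legend: the 20 standard one-letter residue codes,
# plus X (unknown), * (stop codon), / (possible nucleotide deletion) and
# \ (possible nucleotide insertion).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
MARKERS = {"X": "unknown", "*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend markers in one amino acid segment.

    Whitespace is ignored; any other character is reported as
    "unrecognized" so that transcription damage is easy to spot.
    """
    counts = {"residues": 0, "unknown": 0, "stop": 0,
              "deletion": 0, "insertion": 0, "unrecognized": 0}
    for ch in re.sub(r"\s+", "", segment):
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1
    return counts


def codon_span(begin: int, end: int) -> int:
    """Whole codons in an inclusive nucleotide range (assumed convention).

    abs() lets the same formula serve entries listed in reverse order.
    """
    return (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    segment = "VGSCACAGSCKCKECKCTSCKKSECRAFP"  # entry for SEQ ID NO: 6698
    print(summarize_segment(segment)["residues"], codon_span(668, 754))
```

For the entry listed under SEQ ID NO: 6698 (locations 668 and 754), the segment contains 29 residues and the range spans (754 - 668 + 1) / 3 = 29 codons, so the two columns agree under the stated assumptions. A nonzero "unrecognized" count indicates characters outside the legend, which in this text usually signals transcription damage rather than sequence content.
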
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6707 


2233 


1343 


YWSGIGYELQHFHWRKFHFEKKGPPSTCQBRIjYESRSRWPCIS* 
GMVWGWTAVNGS W * GGQLRCV CVCTSHS SDS TRSSQRAS KCHS 
FFILSQ*KT*SSWENWVFAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR*SRFCGLCNPCGHCGI^INLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF+GVARYA 
C*RCHWYFEWLLYNHCGDILVACL+RRQL*SSQ 


6708 

115 


1729 


TVGS WSRSGRS PPVGRQLLLTGRGAQAAGS PQGGMALQVELVPT 
GEI IRVVHPHRPCKLALGSDGVRVTMESALTARDRVGVQDFVLL 
ENFTSEAAFIENLRRRFRENLIYTYIGPVLVSVNPYRDLQIYSR 
QHMER YRGVS FY EE P PHLLAVADTVYRALRTERRDQAVM I S VE S 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS SRFGKYMDVQFDFKGAPVGGH I LS YLLEKSRWHQ 
NHGERNFHIFYQLLEGGEEETLRRLGLERNPQSYLYLVKGQCAK 
VSS INDKSDWKVVRKALTVIDFTEDEVEDLLSIAASVLHLGNIH 
FAANEESNAQVTTENQLKYLTRLLSVEGSTLREALTHRKI IAKG 
E E LLS PLNLEQAAYARDAIiAKAVYS RTFTWLVGK INRS LAS KD V 
ES PS WRSTTVLGLLD I YGFE VFQHNS FEQFCINYCJME KLQQLF I 
ELTLKSEQEEYEAEGIAWEPVQYFNNKIICDLVEEKFKGI I\SI - 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 

TAAKMEKKVSKRSRKEEEDLEALIAHFQTLDAKRTQTVEIiPCPP 

PSPRLNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK " 

DTWTKVDI PSPP PRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 

FYHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQLILF 

GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGOQ\ 

IPSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMF 


6710 


158 



980 


RHKMTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATN 
IQAGASFG YQLLWWVWANLMAMLI Q ILSAKLGI ATGKNLAEQI 
RDHYPRPVVWF Y WVQAB I IAMATDLAEFIGAAIGFKLILGVSLL 
QGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELI. 
FSQPNLAQLGKGMVT PSLPTSEAVFLAAGVL \GATIMPHVT / YI 
WHSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVNLAIMATAA 
SELNFYGHTGVA 


6711 


3 



347 


VTECKTMTCKMSQLERNI+TMINTLHHYSVKLGHPDTLIHGEFK 
ELVRTDLHNILMKENKNDQAI*HIMEDLOTNAHMQIIFKELIMIj 
MAMLTWSYHDNMHDADYGPGQQHRPG 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VDFFNRINLIYGTMAERCS*TSCPVMAGGPRYEYRWQDERQYRR 
PAKLS APRYMALLMDW IESLI 


6713 


2485 


3 



QARGSDSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QRKTIPVILDGKDWAMARTGSGKTACFLLPMFERLKTHSAQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDII IATPGRLVHVAVEMSIiKLQSVEYWFDEADRLFE 
MGFAEQLQEI IARLPGGKQTVLFSATLPKLLVEFARAGLTEPVL 
IRLDVDTKLNEQLKTSFFLVREDTKAAVLLHLLHNVVRPQDQTV 
VFVATKHHAE YLTELLTTQRVS CAH I YSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLDI PLLDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTIjEASLELRGIiARVADNAQQQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELQRIiRL 
VDS I KNYRSRATI FEINASSRDLCSQVMRAKRQKDRKAIARFQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
EWGRKRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSERG 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








IiSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKIKTESGRYI SS S YKRDLYQKWKQKQKI D * S * L 
GRRRGILTRRRPRTEEVGEARPLAQAGCI PGPHAPRHPLQAESA 
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHI PSGGAPAAGAAPMGPQYCVCKVELSVS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLK^ALFIX5DK5SMR]^EHDFLGQFSCSLG 
TIVSSKKITRPLLLLNDKPAGKGLITrAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDPFIjEFYKPGDDGKWMLVHRTEVIKYTLDPVW 

KPFTVPLVSLCDGDME kp iqvmcydydndgghdfigefqts vsq 

MCEARDSVPLEFECINPKKQRKKKNYKNSGI I ILRSCKINRDYS 
FIiDY 1 DGG CQLMFTVG I D FTASNGNP LDPS S LHY I NPMGTNE YL 
SAIWAVGQIIQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNP 
TNPFCSGTOGIAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPATVQALAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
ABNVTFWKACERFQQl PASDT 


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ 
HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTS YS I DDQSQQS YDYGGSGGPYS KQYAG 
YDYSQQGRFVPPDMMQPQQPYTGQIYQPTQAYTPASPQPFYGNN 
FEDEPPIjLEELGINFDHIWQKTLTVLHPLKVADGSIMNETDIiAG 
PMVFCLAFGATLIjLAGKIQFGYVYGISAIGCLGMFCLLNIjMSMT 
GVS FGCVASVLGYCLLPMI LLSSFAVI FSLQGMVG I ILTAG I IG 
WCSFSASKIFISALAMEGQQLLVAYPCALIjYGVFALISVF 


6718 


290 


599 


KQSS TVPGT ILPS LKWHNSGLiCKF PETGG KMTT FKEGLT FKDVA 
VI FTEEELGLLDPVQRNLYQDVMLBNFRNLLSVGHHPFKHDVFL 
LEKEKKLDIMKTATQ 


6719 


1 



691 


PTRPEEQDREDGKCHKMEMNPISGNLNCDPIAMSQCSSDHGCET 
DLDSDDDKIEKPNKFFMKDSASQDNGLSRKISRKRVCSSDSDSSL . 
QVVKKSSld^TGLLRITRRCAATAANklKLMSDVEDVSLEi^ 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDS EGSTKVLSQALNGDSDSEDMLNS EHKHRHTNIHKIDAP S K 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VPITEKSNPLTQDLDKADAEWIVRLLGQCDAEIFQEEGQAIiSTY 
QRLYS ES I LTTMVQVAGKVQE VLKEPDGGLWLSGGGTSGRMAF 
LMSVSFNQI^KGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
I EELKKVAAGKKRVI VIG I S VGLS AP FVAGQMDCCMNNTAVFLP 
V uvvj t N F V&MAKHP F PFF K I LKo DTVrroijRAPHYQITbljijr 5M 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
V F 1 1 £ JxbN Fij 1 y UliUKAUAhW 1 VKijLCjyuDAlilrybhljyAijS X x 
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPLYTYLI AGGDRS WASREGTEDSALHG 
IEELKKVAAGKKRVIVIGT^VGL^APFVAGOMDCCMNNTAVFIjP 
VBVGFNPVSMARHPFPPPR I LRSIiTVFPSLRAPHYQI TSLLFSM 
SWTLISE 


6722 


1 


390 


RSWSKRTWQALPMAVLFLLLFLCGTPQAADNMQAIYVALGEAVE 
LP CPS PSTLHGDEHLSWFCS PAAGSFTTLVAQVQVGRPAPDPGK 
PGRE SRLRLLGN Y S LWLEGS KEEDAGR YW CAVLGQHHNYQNW 


6723 


173 


659 


VCQY CTARMAD FG I S AGQFVAWWD KS S P VEALKGLVD KLQ ALT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL 
AE I AR I LRPGGCLFL KE PVE TAVDNNS KVKTAS KLCSALTLS GL 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








VEVKELQREPL.TPEEVQSVREHLGHESDNL 


6724 


173 


659 


VCQ YCTARMADFG I SA6QFVAWWDKSS PVEALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AHKES S FD I ILSGLVPGS TTLHS AB I L 
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL 
v £i v iUi.I_iyK.h< FLi JuriSr* VQS VREHLGHESDNL 


6725 


356 


722 


RRRTPPVI I1ATMDDDLMLALRI1QEEWNLQEAERDHAQES LSLVD 
ASWELVDPTPDLQALFVQFNDQFFWGQLEAVEVKWSVRMTLCAG 
ICSYEGKGGMCSIRLSEPLLKLRPRKDLVEVFFV 


6726 


98 


714 



HLQ KMERK I NRREKE KE YEGKHNS LEDTDQGKNCKSTLMTLNVG 
GYL Y I TQKQTLTKYPDTFLEG I VNGKILCP FDADGHYF I DRDGL 
LFRHVLNFLRUGELLLPEGFRENQLLAQEAEFFQLKGLAEEVKS 
KWEKEQLTPRETTFLEITDNHDRSQGLRIFCNAPDFISKI KSRI 
VIiVSKSRLDGFPEEFSISSNIIQFKYFIK 


6727 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTFKGPIYNRGCTDIICCVFLL 
IiAI VGYVAVG 1 1 AWTHGDPRKVI YPTDSRGE FCGQKGTKNENKP 
i lit x r JN 1 VKCAS PDVLLEFQCPTPQ I CVEKCPDRYLTYLNARSS 
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPLARRCF 
PAIHAYKGVLMVGNETTYEteHGSRKNITDLVEGAKKANGVLEA. 
RQLAMRIFEDYWSWYWDIISLGIAWAMSLLFIILLRFLAGIMG 
RGMI IMGILVLGY - ■ 


6728 


486 


935 


FCSSWLRSLADSSLSWKMFLVGLTGGIASGKSSVIQVFQQLGCA 
VIDVDVMARHWQPGYPAHRRIVEVFGTEVLLENGDINRKVLGD - 
L I FNQPDRRQLLNAITHPE IRKEMMKETFKYFLREPRTSPRGKK 
H VPSAliKEADS LMRRDT 


6729 


259 



1191 



VGLTGAQSGRTASMGRDQRAVAGPALRRWLLLGTVTVGFIiAQSV 
liAGVKKFDVPCGGRDCSGGCQCYPBKGGRGQPGPVGPQGYNGPP 
GLQGFPGLQGRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGI 
PGHPGQGGPRGRPGYDGCITGTQGDSGPQGPPGSEGFTGPPGPQG 
PKGQKGE P YALPKEERDRYRGEPGE PGLVGFQGP PGRPGHVGQM - 
(ji* VCjAPCjKPG P PGPPGPKGQQG^3RGIiG F YGVKGE KGDVGQPG PN 
_GI?SDTLHP I IAPTGVTFH PDQYKGEKGS EGE PGIRG ISIiKGEE 
GIM 


6730 


784 



1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYEVLSNDEKRDI YDKYGTEGLNE F 


6731 


1 


446 


GIRKRIiHGAWPRVEVGCPWETRESEGVHLERPTSPLKNNDEGS 
LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAKEATY 
NDLQ VE YGKCQLQMKELMKKFKEIQTQNFS LINENQSLKKNI SA 
LIKTARVE1NRKDEE I 


6732 


J. U A 


i one 


GRWQRRPPPP&PPI,WCL<2PGGGSDPQQLTQLRHCIjSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE 
GRGRLAPQNGGS SDAPAYRTPPS RQGRREVRFSDE PPEVYGDFE 
PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
JJ£J\b b y X LUjfa U X 1 S KKT VRS I QEAPA V S EDLVI RLRRP PLR YPR 
YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6733 


613 


1311 


RSCRQVGMRSRNQGGESASDGHISCPKPSIIGNAGEKSLSEDAK 
KKKKSNRKEDDVMAS GTVTCRHLKTSGECERKTKKSLELS KEDL I 
QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL 
IiIjAEKCHRRTVYELENEKHKKTDYMNKSDDFTNLLEQERERLKK 
LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHFTWEEWQDLDD 
AQRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVEE 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TLNLiRLSGGSKKOVFSGICHRSLVEIjOEVHLV 


6735 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QASIiNAGLDIiRLAVQLPPGEDLNDWVAVHVVDFFNRVNLIYGTI 
XDGCT 


6736 


195 


808 


MNYELNFKREMPNIKSLGLTNLNFLLKRLSSVLPLITDYVYFEN 
S S SNTP YL IRRI EELNKTASGNVEAKWCFYRRRD I SNTL IMLAD 
KHAKEIEEESETTVEADLTDKQKHQLKHRELFLSRQYESLPATH 

IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
E I RVGPRYQAD I PEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRLESYRPDTDLS 
REDTGCNLQHrSDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 

REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAI YYH I KNRD PDGRMIiLD I FDENLH PT^S K<3 F VP Pnvmmrsr 

PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
ANWKKIVLGAILLASKVWDDQAVWNVDYCQILKDITVEDMNE 
RQFLELI^FNINVPSSVYAKYTFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAISRLCEDKYKDIjRRSARKRSASADNLTLPRWS pahs 


6738 


148 



653 


CACAEOPARAE VGAATAL P VRWAS GE MAP <3R c; T . A VP T .a VT ,VT .T .T . 
WGAPWTHGRRSNVRVI TDENWRELLEGDWM I EFYAPWCPACQNL 
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQ 
ANKE S S S ES F I SPJLT_JX T VADT .YPYVFO VQfJT .Y T Tn/fSTYD u T c a u vx? 

VLAAP^DSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAE I IASHWVSEVEGVNKAL 


6740 


3 


631 



SWPDMAEEEVAKLEKHLMLLRQEYVKZjQKKLAETEKRC^ 
ANKESSSESF1SRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
; VLAARSDSWSLANLSSTKELDLSDANPEVTMTMI^ ; 
REDD VFLTE LMKIJXNP POLA T.T .PP P HP KT3VM ci t .VNTVP MP T V ffvn 

TAEELNASTLMNYCAE I IASHWVSEVEGVNKAL 


6741 


141 


960 


PLTL PFS SRAPAGHTMNTSPGTVGSDPVILATAG YDHTVRFWQA v 
HSG I CTRTVQHQpSQVNALEVTPDRS M IAAAVQP VS LGYQH I RM 
YDLNSNNPNPI ISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI 
WDLRSRNLQCQRIFQVNAPINCVCIjHPNQAELIVGDQSGAIHIW 
DLKTDHNEQLI PEPEVS ITSAHIDPDAS YMAAVNSTLVPFSCLL 
PLAI G I LOEGE FES LARRGLL FLACOGNC YVWNT .TfiGT GHP VTH 
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSG I CTRTVQHQDSQVNALEVTPDRSMIAAAVQP VSLGYQH I RM 
YDLNSNNPNP 1 1 S YDGVNKNIAS VGFHEDGRWMYTGGEDCTAR I 
WDLRSRNLQCQRIFQVNAPINCVCXHPNQAELIVGDQSGAIHIW 
DLKTDHNEQLIPEPEVSITSAHIDPDAS YMAAVNSTLVPFSCLL 
PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRISISKQLASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN 
FAEGQETKP KYRE I LSELDEHTENKLDFEDFM I LLLS I T VMSDL 
LQNIR 


6744 


95 


1343 


RTPARNRCAGCEVLSRFSS PNKAS SFALQSAGGGLPAVRALRRD 
RQKVSTVGYGMDEVEQDQHBARLKELFDSFDTTGTGSLGQEELT 
DLCHMLSLEEVAPVLQQTLLQDNLLGRVHFDQFKEALILILSRT 
LSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP 
DDLNASQSGS S PPQDW I EE KLQEVCEDLG I TRDGHLNRKKLVS I 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








*~*- t >£. A vjjj\^w v LyvjijiuujCjCi V C nil ±1 U tr LAj 1 Vlo V Hi Lie c X r l\JNljJ\iijx 

PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVER ILDTWQEBG IENSQE I LKALDFGLDGNINLTEL 
TLALENELLVTKNS IHOACI 


6745 


1 


588 


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGLPSSTARQQNNP 
AAGTECFAAVWAPfiTAMn9VT»<!TT*QfTPfCl&P&Qa'rAP at pdddpd 

ELPVTS FDCAVCLEVLHQPVRTRCGHVFCRSCIATSLKNNKWTC 

PYCRAYLPSEGVPATDVAKRMKSEYTCNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 




492 


SLWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKLADCEKMA 
VEFGNQLEGKWAVLGTLLQEYGLIjQRRLENVENLLRI^RN 


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDLAQRKIiYRDVMLENFRNLLS VGH " 
QPFHRDTFHFLREEKFWMMD I ATQREGNS vyagvc 


6748 


201 


665 


LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
QIEARLSISXVQQXPYRCNECKQ 


6749 


95 


719 


SAGGPCPAAAGGGPGGAS CSVGAPGGVSMFRWLEVLEKE FDKAF 
VDVDLLLGE I DPDQAD I TYEGRQKMTSLSSCFAQLCHKAQS VSQ 
INHKLEAQLVDLKSELTETQAEKWLEKEVHDQLLQLHS IQLQL 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


428 


SCESRRPGAKWVWASGALPRDTTGLGSEQPSGDVAQSNRATWGT 
TAPGPIHLLELCI)QKLMEFLCNMDNKDLVWLEEIQEFAERMFTR 
EFSKEPELMPKTPSQKNRRKKRRISWQDENFJ)PIRRRLSRRKS 
RSSQLSSRR 


6751 



152 



1417 



PTKATEMAGASVKVAVRVRPFNSREMSRDSKCI IQMSGSTTTIV 
i\ Jr ivyjr IvJc* 1 F JUa r i> £ JJ i £> i W bri 1 JV JiDxNxASQKQVYRDIGEEML 
QHAFEG YNVCI FAYGQTGAGKSYTMMGKQEKDQQGI I PQLCEDL 
FSRINDCTITOIWSYSVEVSYMEIYCERVR^ 
HPLLGPWEDLSKiy^VTSYi^IQDLMDSGNKARTVAAT^ 
SRSHAVFNI I FTQKRHDAETNITTEKVSKISLVDIiAGSERADST 
GAKGTRLKEGAN I NKS LTTLGKV I SAlxAEMDSGPNKNKKKKKTD 
FIPYRDSVLTWLLREI^GNSRTAMVAALSPADINxTjETLSTLR • 
YADRAKQIRCNAVINEDPTOKLIRELKDEVTRLRDIxLYAQGI^GD 
ITDMTNALVGMSPSSSLSALSSRNV 


6752 



24 



1834 


RNCVPPI^CYRSRVKFHSDIKMQYSHHCEHjxLERLNKQREAGFL 
CDCTIVIGEFQFKAHRNVLASFSEYFGAIx^STSEIWVFLDQ^ 

vkadgfqkllefiytgtlnldswnvkeihqaadylkveevvtkc 
ktkmedfafimpssteissitgnielnqqtclltlrdynnrsk 
sevstdliqanpkqgaixakkssqtkkkkkafnspktgqnktvqy 

PSD I IiENAS VE Ij FT .D ftNTCT. PT PWP O VACi T MT1N Q T?T .J?T /P CV\m V 

TFPAQDIVHIVIVKRKRGKSQPNCALKEHSMSNIASV^ 

NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 

CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF 

HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR 

CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR 

KHTCEKPFICTLCGNSYTDIKl^KKHKTKVHSGADKTLDSSAED 

HTLSEQDS IQKS PLSE TMDVKPSDMTL PLALPLGTEDHHMLLPV 

TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVALKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSQTPAPEHDKAAWKMPIiAQKPALiAPKPTSQTPPAS 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQESSDRRPPSPP 
GPEERKGQKRDEEEEIATERKPASPPLPATQQEKPSQTPEAGRKE 
KPMLQSRHSLDGSKLTEKVETAQPLWITLALQKQKGFREQQATR 
EERKQAREAKQAEKLSKENVSVSVQPGSSSVSRAGSLHKSTALP 
EEKRPETAVSRLERREQLFCKANTLPTSVTVE I S YSS PAAPLVKE 
VSKKFSSPDDAPVSSEPAWIALAKRKAKAWSDCPLIIK 


6754 



2 


413 


FVRRRRRRLGGPEVNTMSSLHKSRIADFQDVLKEPSIALEKLRE 
LSFSGIPCEGGLRCLCWKILLiNYLPLERASWTSILAKQRELYAQ 
FLREM I IQ PG I AKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVXiL 


6755 


298 


1343 


PGLQLQVALEADWFLDMPGGRRGPSRQQLSRSALPSLQTLVGGG 

CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 

LLFEFLFF I YLLVALF IQY INI YKTVWWYPYNHPASCTSLNFHL 

IDYHLAAFITVML^RLVWALISEATKAGAASMIHYMVLISARL 

VLLTLCGWVLCWTLVNLFRSHSVLNLLFJ^ 

DS RAHLLLTD YNYWQHEAVEES AS TVGGLAKS KD FLS LLLE S L 

KEQFNKATP I PTHSCPLS PDLIRNE VECLKADFNHRIKEVLFNS 

LFSAYYVAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6756 


r™™"~ too 


754 


1 ERALGS LPLS 1 P V SWGS1»RTI^YQQQPLRPKVLLCQTRVQCHD 
LRS LQPQPPGLKQS FCLRVLGLQTGATTPGLRDLTCKEL 1 1 LTE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 
KALYWDVMLENYRNIiVFLGKDNFALE VKI CPRVFLYFLCCLS WE 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLFSHLSAVQ 
TRG I KHRI KWNRKALPSTAQ I TEAQVAENRPGAFI KQGRKLDID 
FGAEGNR Y YEAN YWQ FPDG I HYNG C S EANVTKEAFVTGC INATQ 
AANQGE FQKPDNKLHQQVIi W 


6758 


1 



1008 


ASGPELPGRRFRDRAPWLPARLLRGVIiAVWVSLSALGPGSFCRR 
RVPSLAQIiGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPS FRRNMANNS PALTGNSQ?QHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMI LTNI LSS PYFKVQLYELK 
TYHEWDE I YFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGG I 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
yPP lJJJjWUWbh.br JjDUJl.JiUJjUVJsAWjQjL.vMl l^KMJjKSr JjTKLJi 
WFSTLFPRIPVPVQKNIDQQIKTRPRKI 


6759 


1 



513 


RKHNFHS LDGTS TRAFHPQTGL P LLS S PVPQRKTQSGC FDLDSS 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEESVLNYRFDPLGIVTXSFTAEVGaSGAFCPTHLTLPVEVSFY 
SVSDDNAPS P YMGV I TLESLGKRGYRVP PSGTIQWCVL 


6760 




606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT 
AMSVKEVLQS LVDDGMVDCER I GTSNY YWAFP S KALHARKHKLE 
VLESQLSEGSQKHASLQKSIEKAKIGRCETEERT 


6761 


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
SSS SVQRCELSLFQSLHTMTS KKLVNS VAGCADDALAGLVACNP 
rnjQIiLQGHRVALRSDLDSLKGRVATJ^GGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGSIIJ^IRAVAQAGWGTLLIVKNYTGD 

IxJJXn C ^UttS\r±\£nr\r\I2i\J X JT V Eil J V V l\JL'L/k3nC X V JJI\Xvn\j£v£^\J JLIV^O X V 

LIHKV7VGALAEAGVGLEEIAKQVWVVTKAMGTLGVSLSSCSVPG 
S KPTFELS ADEVELGLGIHGEAGVRR I KMATADE I VKLMLDHMT 
NTTNASHVPVQPGSSVVMMVNNLGGLSFLELGI IADATVRSLEG 
RGVKIARALVGTFMSALEMPG I S I»TLLLVDEPLLKLIDAETTAA 
AWPOTAAVSITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL 
ERVCSTIJjGLEEHIjNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PP PAS PAQLLS KLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT 
SLPAWS AAMDAGLEAMQKYGKAAPGDRTMLDS lwaagqel 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6762 


3 



613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSliFI 
QVAFITLAVAAGLYYLAELIEB YTVATSRI I KYMI WFSTAVIiIG 
LYVFEEIFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LWVNHYIAFQ FFAEEYYPFS EVLAYFTFCLW 1 1 PFAFFVSLSA 
GENVLPSTMQPGDDWSNYFTKGKRGK 


6763 


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGHWDMNSAPRLVSETAE I 
RKQEQKTGTEAEAADSGAVGARRFLLCLYLGGFLDLFGVSMWP 
LLSLHVKS IiGAS PTVAG I VGSS YG ILQLFS STLVGCWSDWGRR 
S SLLAC I LLS ALGYLLLGAATNVFLFVLARVPAG I FKHTLS IS R 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYIiTELEDGF 
YLTAF I CFLVF I LNAGLVWFFPRREAKPGSTE 


6764 


80 



438 


LKKMDTMMLSVRNLFEQLVRRVBILSEGNEVQFIQLAKDFEDFR 
KKWQRTDHELGKYKDLLMKAETERS ALDVKLKHARNQ VDVE I KR 
RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL 
RGS VLVSEALS GSAMDG I VTEVAVG VKRGS DE LL SGS VLS S PNS 
NMS SMWTANGNDS KKF KGEDKMDGAPSRVLH I RKLPGEVTETE 
VIALGLPFGKVTNI LMLKGKNQAFLELATEEAAI TNGNYYSAVT 
PHLRNQ 


6766 



1 

1287 


EGGSFKASLTWLWPLGEMKLHCEVEVISRHLPALGLRNRGKGVR 
AVLSLCQQTSRSQPPVRAFLLISTLKDKRGTRYELRENIEQFFT 
KFVDEGKATVRLKEPPVDICLSKANSSSLKGFLSAMRLAHRGCN 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL 
QTSYCGLVRVDMRMIjCLKSLRKLDLSHNHIKKLPATIGDLIHLQ 
ELNIiNDNHLESFSVALCHSTLQKSLWSLDLSKNKIKALPVQFCQ 
LQELKNLKLDDNELIQFPCKIGQLINLRFLSAARNKLPFLPSEF 
RNLSLEYLDLFGNTFEQPKN^PVIKLQAPLTLLESSARTILHNR 
IPYGSHII PFHLCQDLDTAKICVCXjRFCIjWSFIQGTTTMNIjHS V 
AHTVVLVD^GGTEAPI IS YFCSLGCYVNSSDI 


6767 

336 


919 


APMICLCSSDLQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFLNV 
GRU2SDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQLASDVHY 
RQPLPQPTCDPEQLGLRHAQKAHQLQSDVKYKSDLNLTRGVGWT 
PPGSYKVEMARRAAELANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATE I LHVKKKKALLL 


6768 


2 


363 


PGSTISCYLLSEGSLPLCMQVACGBEKHRAPTMKTLRARFKKTE 
LRLSPTDLGSCPPCGPCP I PKPAARGRRQSQDWGKSDERLLQAV 
ENNDAPRVAAL I ARKGLVPTKLDPBGKS AFHL 


6769 


284 


396 


MSTPDFS TAENNQE LiANE VS CLKAMLTLMLQAMGQAD 


6770 


1 


397 


QRNYQVIWSSTMAKLHDYYKDEVVKKLMTEFNYNSVMQVPRVEK 
ITLNMGVGEAIADKKIiDNAAADLAAISGQKPLITKARKSVAGF 
KI RQG YP IGCKVTLRGERM WE FFERL IT I AVPR I RDFRGLS AKS 


6771 


3 


378 


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPR 
WASWNIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCM 
QEMGNG KANRL YEAYLPE T FRR PQ I D P YLFWSNLEG 


6772 


1 


1400 


AAAFLQGMT VNGF INTV I TS L \ ERRYDLHS YQSGLI AS S YD IAA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P**GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPLYTLGVTYLDENVKSSCSP I YIAIFYTAAILGPAAG Y 
LIGGALLNIYTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 
I RDLPLS I WLLLKNPTFI LLCLAGATE ATL I TGMS T FS PKFLE S 
QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK 
FCLFCTWSLLGILVFSLHCPSVPMAGVTASYGGSLLPEGHLNL 
TAPCNAACSCQPEHYSPVCX3SDGLMYFSLCHAGCPAATETNVDG 
QKVYRDCS C I PQNLS SGFGHATAGKCTST 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6773 


1 



630 


PWEAPECEHKYKAEEHTVVLTVTGEPCHFPFQYHRQLYHKCTHKG 
RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFE PQLLRFFHKNE I W YRT 
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRL 
CHCPVGYTGPFCDVGE*GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYFLFFILSS/WVPTFLSMDVDGRVIKADSFSKIISS 
GLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TCPSQLRVLTARGGRRAPSPQLWTLVLALI EEKWRSHRI LRMNS 
GRPETMENLPALYT I FQGEVAMVTDYGAF I KI PGCRKQGLVHRT 
HMSSCRVDKPSEIVDVGDKVWVKLIGREMKNDRIKVSLSMKVVN 
QGTGKDLDPNNV\SLS KKRGGGDPSRI TLGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
LNGTFPNTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
TS VAKFVFMAGMMVGGI LGGHLSDRPGRR FVLRWCYLQVAI VGT 
CAALAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIFLTS 
SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 
KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFMA 
YFGLNLHG/ LKHLGNNVFLLQTLFGAV/TP PGQLVLHLGHWGSG 
RVS SRGRVNCLGLFVLQVW 


6777 


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGI ISCIAFS PAQPLYACGS Y 
GRSLGLYAWDDGSPLALLGGHQGGITHLCFHPDGNRFFSGARKD 
AELLCWDLRQSGYPLWSLGREVTTNQRIYFDLDPTGQFLVSGST 
SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCTNGVSLHPSLPLLG 
HCLPVSVCFLSPTESGGRRRGAGPSLGSPRRHVHLECRLQLWWC 
GGGARLQHP * * SPRARKGR 


6778 


311 


805 



I QS I TDES RGS I RRKNPANTRLRLNVP \ EETAGDS E / ERS PEEB 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
OTI^yAEKNXKGPXSPVSSEGIKDF^SMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


R^RRQPRLLAANGIEPESMAISEPIKGSRKPCVNKEELALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHES PPKKKAV 
AWVS AKNPAPMRKKKKVSLGPVS YVLVDS EDGRKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 

403 


HE VNDNKPE ININLMS PGKEE IS Y I FEGDPIDTFVALVRVQDKD 
SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 
LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 

SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 

SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 

TPVFINSSSIIQVMKGSQPSTIPAAPLTTNSGLMPPSVAWGPL 

HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 

SPPCTSSPWPSHPPVQQVKEIiNPDEASPQVNTSADQNTLPSSQ 1 

S TTMVS P LLTNS PGSSGNRRSPVS S S KGKGKVDKIGQILLTKAC 

KKVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 

KTPTPPAPTLLKMTSSPVGPGTASAGPSLPGGALPTSVRSIVTT 

LVPSELISAVPTTKSNHGGIASESLAG 


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKS Q VFKNQDP VLPPRPKPGHPLYS KYMLS VPHGI ANED I VSQ 
NPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKLIT 
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DDLNLTSGBIVYLLEKIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELS FSEGEI 1 1 
LKEYVNEE WARGE VRGRTG I F PLNFVE PVED YPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


S YHHHHAQQSAAAS PNLTASQ KTVTTTSMITTKTLPLVLKAATA ! 
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT 
GPGAEAVQ I VAKNTVTLQVQATPPQP I KVPQFI P P PRLTPRPNF 
LPQVRPKPVAQMNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTIj 
PTSQNS IHPVRWNGQTATIAKTFPMAQLTS IVI ATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEE IQSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYMAVLGFGALTPTSPQS 
SHPDS PENEKTETTFTFPAP VQPVSLPS PTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMW1CPRCQDQP4LK 
KEEAI PW PGTLAI VHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS ISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH 
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6784 


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPLVLKAATA 

TM PAS WGQRPT I AMVTAI NSQKAVLS TDVQNTP VNLQTS S KVT 

GPGAEAVQ I VAKNTVTLQVQATPPQP I KVPQF I PP PRLTPRPNF 

LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL- 

PTSQNS IHPVRWNGQTATIAKTFPMAQLTS IV! ATPGTRLAGP 

QWQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 

PQKIAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 

PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS 

SHPDSPENEKTBTTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 

RKSGQLLMCOTCSRVYHIJjCLDPPLKT I PKGMWI CPRCQDQMLK 

KEEAIPWPGTLAIVHSYIAYKA^EEKQKLLKWSSDL 

LEQKVKQLSNSISKCMEHyiKNTILARQKEMHSSLEKVKQLIRLIH 

GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 

ANCNQGEETK 


6785 


1 


528 


IX5NTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKDI \TRKLTVARFYGGTSYQSQ 
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML 
DLGFAEQVEDI IHES YKTDS EDNPQTLLFSATCPQWVYTVA\ KK 
YMKSRYEQVDLDGKMTQKAATTVEHLA IQCHWSQRPAVIGDVLQ 
VYSGS EGRAI I FCETKKNVTEMAMNPH I KQNAQCLHGD I AQS QR 

al. LLjiSSjC KUbor 1\ V JjvAliN VAAK^ljlJlr'JoVUijVXv^^* * «^ v ^^ 

YIHRSGRTGRAGRTGICICFYQPRERGQLRYVEQKAGITFKRVG 
VPSTMDLVKSKSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV 

njVT AAIVT atTTCfiaCQT7T?DDCT.T r PC;nW'r!T?\7'PM , rT.R , QT.T?'C?Tnr)VQP 
i lfA ] ihhhi im m x ourtj o I; c«Jrx\oJjx i &UJS*yjc v ii v ix JLm^iJiiiix^iJ vol 

AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE 

WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 

RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 

FD*VFYHLVDFLSDFLVDSVYLTGRQIDHLTGLTGLIDHLTSHS 

SVWN 


6787 


2646 


2270 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF 
FELGSPSGVISAHCNLRLLGSSDS PAPASRVAGI IGTCHHAWLI 
LVFLVEMGFHHVGQAGLKLLTL\VIHPPWPPKVLGLQT 


6788 


16 


936 


GGTVDLR \ DMLAVS VLAAVRGGR/ ATVRRVRESNVLHEKS KGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ 
EVILWGS * DS * GYPKGK* LLPKEVPS R / RVUjSGLTPLDATQE \ 
FTEDLSK\ YVTTMVCVAVNGKPMDGVI HKP FSEYTAWAMVDGGS 
NVKARSS YNEKTPR I WSRSHSGMVKQVALQTPGNQTTI I PAGG 
AGYKVLALIiDVPDKSQEKADLY IHVTYI KKWD I CAGNAI LKALG 
GHMTTLSGEEI S YTGSDGIEGGLLAS IRMNHQALVRKLPDLEKT 
GHK 


6789 


2 


678 


GNGINVLKIAPESAIKFMAYEQI KRLVW* * PGDS*GF/ YERIiVA 
GSLAGAIAQSS I YPMEVLKTRMAIiRKTGQYSGMLDCARRILARE 
GVAAFYKG YVPNMLGI I P YAG I DLAVYETLKNAWLQHYAVNSAD 
PGVFVLLACGTMSSTCGQIAS YPLALVRTRMQAQAS IEGAPEVT 
MSSLFKHILRTEGAFGLYRGLAPNFMKVI PAVS I S YWYENLKI 
TLGVQSR 


6790 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWI VS SCLCRAWTAPS TSQKCD 
EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGNIWAFPGNINSDGWRHELQHPI IARYVRIVPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHVVLPYRFRNKKMKTLKDVI 
AI^FKTSESEGVILHGEGQQGDYITLELKKAKLVLSLNLGSNQIj 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 
TNGEFDYLDLDYEITFGGIPFSGKPSSSSRKNFKGCMESINYNG 
WITDLARRKKLEPSNVGNLSFSCVEPYTVPVFFNATSYLEVPG 
RLNQDLFSVSFQFRTWNPNGLLVFSHFADNLGNVEIDLTESKVG 
VHINITQTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG 
DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC 
MQLIQVDDQLVNLYEVAQRKPGS FANVS IDMCAI IDRCVPNHCE 
HGGKCSQTWDSPKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 
SNYYWIDPIX3SGPLGPLKVYCNMTEDKWTIVSHDLQMQTPVVG 
YNPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRIili 
NTPDGSPYTWWVGKANEKHYY^GGSGPGIQKCACGIERNCTDPK 
YYCNOTADYKQWRKDAGi^SY^ 

SVGPLRCQGDRNYVmAASFPNPSSYLHFSTFQGETSADISFYFK 
TLTPWGVFLENMGKEDFIKLELKSATEVSFSFDVGNGPVEIWR 
SPTPLNDDQWHRVTAERNVKQASLQVDRLPQQIRKAPTEGHTRL 
ELYSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDIiAQ 
EE IRFS FS TTKAPC I LLYISS FTTDFLAVLVKPTGSLQ I RYNLG 
GTREP YNIDVDHRNMANGQPHS VNI TRHEKTI FLKLDHYPS VS Y 
HLPSSSDTLFNSPKSLFLGKVIETGKIDQEIHKYNTPGFTGCLS 
KVUc NUi. AFLiJ\AAJjKU 1 WAHArl V nlUoHij V JloWLXiAbi'ijiijiaFW 
S SATDPWHLDHLDSASADFP YNPGQGQAIRNGVNRNSAI IGGVI 
A\ WI FTPS LCTP \ VLP * SR*HVS PHKGTLP I PNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAMG 


6791 


1801 


1193 


TGHEGAKGEKGDKGDLGPRGERGQHGPKGEKGYPGIPPEIj/PGW 
SAW*SWLTAASTKVQAILLPQPLE*LGLQXAFMASLATHFSNQ 
NSG 1 1 FSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
E E VY VYLMHNGNTVFS MYS YEM KG KS DTS SNHAVLKLAKGDE VW 
T .P MGMGALHGDHOP FS TFAGFT.L FE TIC 


6792 


33 


1073 


VRHTNWGVDM YLFSLGSESPKGAIGHI VSTBKT I LAVERNKVLL 
PPLWNRTFSWGFDDFSCCLGSYGSDKVLMTFENLAAWGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCLAAS VTFS LLVSGSQDCTCILWDLDHLTHVTRLPAHREG I SA 
ITI SDVSGT I VS CAGAHIiSLWNVNGQPLAS I TTAWGPEGAITCC 
CLMEGPAWDTSQI I ITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGIiESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKLLVGDERGRI FCWSADG* EERGSRGSGTTVPG 


6793 


2340 


805


GRKEANY \ YGSLTQAGTVSU3LDAEGQEVFVP FSAVLPMVAPND 
LVFDGWD I S SLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPS V 
YI PEFIAANQSARADNLI PGSRAQQLEQIRRDIRDFRSSAGLDK 
VI VLWTAInTERFCEVI PGLNDTAENLLRT I EIiGLEVS PSTLFAV 
AS I LEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKS VLVD PL I G S GLKTMS I VS YNHLGNNDGENLS APLQ FRS KEV 

SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRALDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPIMLDLALLTELCQRVSF 
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPWNALFRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRI»FLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794


J- D J 

* 




CD VKRKPEASAH * EKPGP PSRPG VRGGRERAGGRGSHGARS CR \ 
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
IAKAEIiDGT I LKSRPLR IRFATHGAALTVKNLS PWSNELLEQA 
FSQFGPVEKAVWVDDRGRATOKGFVEFAAKPPARKALERCGDG 
AFLLTTTPRPVI VEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
K r J\\2 Ftj ± blir c. i AS K W KAliDh MkKQ QREQVDRN I REAKEKLEAE • 
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR - 


6795 


1740


1010 


GPRRQTQVRDHELDSF*DWAAQETDCAQNSGERL* KGV/LENFS J 
TMS KS AVKI SIiDLLSNPLCEQDQDLLl^rrALDTAMKRMDAFNQ " 
EKVWQ I QKT VI E PLKK F GS V F PS LNMA VKRRE QALQDYRRLQAK 
VEK YE EKEKTG P VLAKLHQAREELRP VREDFEAKNRQLLE EM PR 
FYGSRLDYFQPS FESLIRAQWYYSEMHKI FGDLSHQLDQPGHS 
DEQRERENEAKLSELRALS IVADD 


6796 


48 


683 


GKEIQI PTI KLAWLLFGLE * P VGALGKGWS F* * SHVALGQLGW 
LTRAVRSSWRWELCVSAQ_EWSQRSA*SSPSPVGACPSLNPPET. , 
SVQEGRDCWQR*LPRLFSMjVGQPGC^PQGAPPER^ 
HLQSQVLR*ERRRCCRCLPRFA*GWRRRHQRLGLGIHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797


1620 


211 


TERMTPSQPTRGSSCTRFSSMLWTSTWRCLTCHWAGMRMSWGV 

TLGPMAQGLLSASGTTTEATWTRPTTHLTIjIRWWLLTASRVDPP 

b Re P v P PS DDLTLLES SS S Y KN It/ DAQ I PQ / D WS MSPSTSG*RP 

LTSRASS IMRSRTAIPSAS *SRLTTKHTVGGSPSAWRPRPTSRS 

VSTPVSSSTETTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 

TSLWGSCX3GSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 

RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 

PHCGWPCPASCASAAAWLSSTWATASVAGSCWGPIM*SSAHSPW 

CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 

ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 

APPPASSAAGAARPAAFSAAAEGTPRRS IRCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQS PQE WEALQALTYLGDRVS EKVKTKV IELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDE E KS KL LAKLL KS KNPDDLOEANKL I KS MVREDE AR T OKV 

TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDlIiQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSIiSSVIiA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQTuPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPIiFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLVWVSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTBLSPFSPIQPPAAITQVMLIJ^PLKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGS PRLSALECVLLVPQ\ PQ1A 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKIKDAYHMIiKRQGIVQSDPPIPVDRTLlPSPPPRPKNP 
VFDDE E KS KLLAKLLRS KNPDDLQEANKL I KS MVREDEARI QKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDI LQASDNLSRVINS YKTI I EG 
QVINGEVATLI^PDSEGNSQCSNQGTLIDLAELDTTNSLSSVIiA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APS AGS S LFSTGVAPALAPKVE PAVPGHHGLALGNSALHHLDAL 
DQLLEE AKVTSGLVKPTTS PL I PTTTPARPLLPFSTGPGS PLFQ 
PLS FQSQGSPPKGPELSLAS IHVPLES IKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQP PSGTELS PFS P IQP PAAI TQVMLLANPLKE KVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6800 


404 


1646 


RRSPSTGLSPVPQPSSPSLSDYSIPWSLLLSGTIAWATPGK*AG 
* PQAW* LGLAPAIAFI /GLTRGRKQNKEKM^GGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKS I VWYPPWAR IGTEAGTRARARARA 
RATRARRAVQKRAS PNSDDTVLS PQELQKVLCLVEMSEKPYI LE 
AALIALGNNAAYAFNRDI IRDLGGLPIVAKILNTRDPI VKEKAL 
IVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNE YQHMLANS I SDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQyPSSLG\SLFNKKENKEyiLK3^yiFENI 
FKWEENEPTQNQFGEGSLFFFLTCEFQVCADKVIXSIESra 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAE E FE S QQAS VTMHDVDAE S FE VL VDYC YTGRVSLS EANVERL 
YAASDMLQLEYVREACAS FLARRLDLTNCTAI LKFADAFGHRKL 
RSQAQS Y IAQNFKQLSHMGS IREETLADLTLAQLIAVLRLDSLD 
VESEQTVCHVAVQWLEAAPKERGPSAAEVFKCVRWMHFTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R*QQQLSCICSRKSTPETGYVCQGDGDLLWTPQRSLS\RYDPY 
SGDIYTMPSPLTSFAHTKIVTSSAVCVSPDHDIYLAAQPRKDLW 
VYKPAQNSWQQLADRLLCREGMDVAYLNGYIYILGGRDPITGVK 
LKEVEC YS VQRNQWALVAPVPHS FYSFEL I WQNYLYAVNSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDE I YCICD I PVMKVYN 
PARGEWRRI SNI PLDSETHNYQ I VNHDQKLLL ITSTTPQWKKNR 
VTVYEYDTREDQW IN IGTMLGLLQ FDSGF I CLCARVYPS CLEPG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRKAQDQQGSL 


6802 


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 
PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE 

RDRLQREEKEKERLNEELHELKEENKLLKGKNTLANKEKEHYEC 
EI KRLNKALQDALNI KCS FSEDCLRKSRVEFCHEEMRTEMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCIAPPPVCCQAG/PR 
TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRER 
G 


6803 


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLALDN 
1 KM I VEMLRTDLS YL CS RWRMTGQ PT I TF P I SHS MLDEDGTSLNS 
S ILAALRKMQDGYFGGARVQTGKLSEFLTTSCCTHLSFMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKIJ^TSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASRPSFNLLDSPHPRQENQVPSVRVE IHLPRD 
QSGEVDFKALVLQLKETSSLQEQADILYMLYTMKGPDWNTELYN 
ERSATVRELLTELYGKVGEIRHWGLIRYISGILRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKTI SAPLP YEALTQLIDEAS EGDM 
SISILTQEIMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATELA 
HSLRCS AEEATEGLMNLS PSAMKNLLHH I LSGKEFGVERK/ SVR 
PTDSNVSPAISIHEIGAVGATKTERTGIMQLKSEIKQVEFRRLS 
ISAESQSPGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCKGLSVEGFVLPSSTTREMTPGE I K 
FSVHVES \VLNVLLRPEYRQLLVEAILVLTMLADIEIHS IGS 1 1 
AVEKI VHI ANDLFLQEQKTLGP \ DDTMLAKD PASG \ I CTLR\ YD 
SAPSGRFGTMTYLS \RAA\ATYVQEFLP\HS ICAMQ 


6804 


1 


951 


GSPGKKEEKAKNKBSLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES' 
LQTVAEEESCSPSVELEKPPPVNVDSKPIEEKTVEVNDRKAEFP 
SSGSNFSA* IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN _ 
QEEVRS I KS ETDST I EVDS VAGELQDLQSERE * LASRF * CQCEL : 
KQ**SARTRTS*KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK : 
QQKEGKRHK 


6805 


1539 


206 


rqpdlkyfgksfdvsvsesssllsndlpkfadgikarnrnqljlyl 

vpspvlrildhtafsteksadivicdeecdspesvnqqtqeesp 

i evhtaed vp iavevhai s ed yd i etennsseslqdqtdeeppa " 

klckildksqalnvtaqqkwpllranssglykcelcefnskyfs 

dlkqhmilkhkrtdsotc^vckesfstnm^ 

ckycdyktvifenlsqhiadthfsdhrjywceqcdvqfssss^ijy 

lhfqehscdeqylcqfcehetndpedlhshvvnehacklielsd 

kynngehgqysllskitfdkcknffvcqvcgfrsrlhtnvnrhv 

aiehtkifphvctdcx;kgfssmle\iakhlnshlsegiylcqyw 

eystgqiedlkihldfkhsadlphkcsdclmrfgnerelishlp 

VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCLiLAELGPVPIEVPIiTRKDAGSQQV 
GFLLGSCXJVFLALTTDACQKGLPKAQTGEVAAFKGWPPLSWLVI 
DGKHLAKPPKDWHPLAQDTGTGTAYIEYKTSKEGSTVGVTVSHA 
S LLAQCRALTQACGYS EAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHWS VPYALMKANPLS W I QKVCFYKARAALVKS RDMHWSLLAQ 
RGQRDVSLS SLRMLIVADGANPWS I SS CDAFLNVFQSRGIiRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLSVLTVQDVGQVMPGANVCVVKLEGTPYLCKTDEVGEICV 
S SS ATGTAYYGLLG ITKNVFEAVPVTTGGAP I FDRP FTRTGLLG 
F IGPDHLVF I VGKLDGIiMVTGVRRHNADDVVATAIiAVE PMKFVY 
RGRI AVFS VTVLHDDRI VLVAEQRPDASEEDS FQWMSRVLQAI D 
S IHQVGVYCLALVPANTLPKAPLGGIHI SETKQRFLEGTLHPCN 
vlm c phtcvtnlp kprqkqpe vg p asm i vgniivagkr iaqas gr 
ELAHLEDSDQARKFLFLADVLQWRAHTTPDHPLFLIjLNAKGTVT 
STATCVQLHKRAERVAAALME KGRLS VGDHVALVYP PGVDLI AA 
FYGpLYCGCVPVTVRPPHPQNLGTTLPTVKMIVEVS KSACVLTT 
QAVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVLAYLDFSVSTTG ILAGVKMSHAATSALCRS I KLQCELYPSRQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVSQYKARVTFCCYSVMEMCTKGLGAQTGVLRMKGVNLSCVRTC 
MWAEERP\R IALTQSFS KLFKDLGLPARAVSTTFGCRVNVAI C 
LQGTAGPDPTTVYVDMRALRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVI IAHTETKGPLGDSHLGE I WVSS PHNATGYYTVYGEEAL 
HADHFSARLS FGDTQT I WARTGYLGFLRRTELTDASGGRHDALY 
WGSLDETLELRGMRYHPIDIETSVIRAHRSIAECAVFTWTNLI* 
VVVVELDGLEQDALDLVALVTNVVLEEHYLWGVWI 
INSRGEKQRMHLRDGFLADQLDPIYVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG 
. S VTQAGPQLQALANLEARRGS IGAALSSRDVSGLPVYAQSGE PR 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 

RFPCGMEVHSGQRELESWAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLS CPLQVQ 


6808 


2063 


737 


GVGSGAASAIARSRPLASRJ^SRRRTRAPRSGAMQRLAMDLRML 
SRELSLYLEHQVRVGF FGSGVGLSL I LGFS VAYAFYYLSS IAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\ 
ITSKPPVQYRNELIKTADGGQISLDWFDNDNSTCYMDASTRPTI 
LLLPGLTGTSKESYILHMIHLSEELGYRCVVFlOniGVAGENIiLT 
PRT YCCANTEDLETVI HHVHSL YPSAPFLAAGVSMGGMLLLNYL 
GKIGSKTPLMAAATFSVGWOTFACSESLEKPLNWLLFNYYLTTC 
LQSSVNKHRHMFVKQVDMDHVMKAKS IREFDKRFTSVMFGYQTI 
DD YYTDAS PS PRLKS VG I P VLCLNS VDDVFS PSHAIP I ETARQN 
PNVAL VLTS YGGHIGFLEG I WPRQS TYMDRVFKQ FVQAMVEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 

TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 

GQFGKILDVEIIFNERGSKGFGFVTFETSSDADRAREKLNGTIV 

EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 

VTGFPYPTOGTAVATOGAHLRGRGRAVYNTFRAAPPPPP I PTYG . 

AVVYQDG FYGAE I \LEATQPTDTLS PLQRRQPTATVTAES TQLP 

TRTITPSGPRRPTALEPCETFHRFLLGP 


6810 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGFVTFETSSDADRAREKLNGTI V 
EGRKI EVNNATARVMTNKKTGNP YTNGWKLNPVVGAVYGPE FYA 
VTGFP YPTTGTAVAYRGAHLRGRGRAVYNTFRAAP PPP P I PTYG 
AVVYQDGFYGAE I \LEATQPTL7TLS PLQRRQPTATVTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSVVAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
SVGQDTQLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMLX3AIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTLGTPLCPRMEDVPLLEPLICKKIAHERLTVLIFLEDCI 
VTACQEGFI CTWGRPGKWS FNP 


6812 


4001 


1682 


EDAVFSLDLSTI IQGTWFLNGEELKSNEPEGQVEPGAIiR YRIEQ 
KGLQHRLILHAVKHQDSGALVGFSCPGVQDSAALTIQESPVHIL 

LJr yl/K V UUl ill IV V VLI1 V.DiJJi\ VUl lAl r* X iVUVJyiVV Pf T" ~ n * ' " ' 

WKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
H I VDPREHVF VHAI TSE C VMLACEV \ DR \ EDAPVRW YKDGQE VE 
ESDFVVLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
EWES PALLLQKEDTVRRLVLPAVQLEDSGEYLCE IDDESASFT 
VTVTEP PVR I IYPRDEVTL IAVTIiECWLMCELSREDAPVRWYK 
DGLEVE ESEALVLERDGPRCRLVLPAAQ P EDGGEFVCDAGDDSA 
FFTVTVTEPPVQFIJUjETTPSPLCVAPGEPVVLSCELSRAGAPV 
VWSHNGRPVQEGEGLELHASGPRRVIjCIQAAGPAHAGLYTCQSG 
AAPG AP S L S FTVQVAE P PVR WAP EAAQTR VRS TPGG DLE LWK 
LSGPGGPVRWYKDGERLASQGRVQLEQAGARQVLRVQGARSGDA 
GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATPR 
CEVSPPDADVTWLRNGAWTPGPQRQSCCSYGGCRMCGQRKART 

MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCAIiVTSSGHL 
LHS RQGSQ I DQTE CV I RMNDAPTRG YGRD VGNRTSLRV I AHS S I 
QRI LRNRHDLLNVSQGTVYI FWGPSS YMRRDGKGQVYNNLHLI^ 
QVLPRLKAFM1TRHKMLQFDELFKQETGQ\NRKISNTWLSTGWF 

TMT T AT .FT »PTYP TMUVfiMfi P PfYFPP HPMHP C VP VU WF P CnDnPf 

TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 




737 


K FP PO EAN / AP F P MR MW5T .MD AT T)MT .P TCW P P YQ TfTTJ Tf T .G Tf T PT 

I^LAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARSFLMGQGGEAAHHTRSPYSTFYPPYHSPELTTPPGHG 
TLDNSKSMKPYNYCSAYESFYESTSPECASPQFEGPLSPPPINY 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD 
SHFPYDLHLRSQSIjTMQDELNAVFHN 


6815 


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE . 

nVf^PT*a VWNf5TVZ QfiQT.PP QriQTTTJK'T.P'nZiT.DPQ CTrMT.I/KTrT ntiriT 

PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLLKTHKF\TiT*GQDEDSLHSVPVAQMGNYQEYLKTLASPLREID 
PDQPrOtfjHTroNPFKQDKKGhWIDEA^ 

SPMSSKRRRSMSLLi^KPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDAT 1 1 HDGHEEKMENGQ I TPDGFLS KS APSEL I NM 
TGDLMPPNQVDS LSDDFTSLS KDGL IQKPGSNAFVGGAKNCS LS 
. VPP£rO)PyMTLG^ 

GRSK ' \ 


6817 


172


3457 


LGMMDS PKIGNGLP VIGPGTDIGI S SLHMVG YLGKNFDSAKVPS 

DEYCPACKEKGKLKALKTYRISFQESIFLCEDLQCIYPLGSKSL 

NNLISPDLEECHTPHKPQrDiKSLESSYKjDSLLIANSkKTRNYIA 

IDGGKVIxNSxvHNGEVYDETSSNLPDSSGQQNPIRTADSLERNEI 

I/EADTVDMATTKDPATVDVSGTGRPSPQNEGCTSKLEMPLESKC 

TSFPQALCVQVHCNAYALCWLDCI LSALVHSEELKNTVTGLCS KE 

ESIFWRT^TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEIET 

CLNEVRDEIFISLQPQLRCTLGDMESPVFAFPLLLKLETHIEKL 

FLYSFSWDFECSQOTHQYQNRHMKSLVTFTNVIPEWHPLl^^ 

GPCNNCNS KSQ I RKMVLE KVS P I FMLHFVEGLPQNDLQHYAFHF 

EGCLYQITSVIQYRANNHFITWILDADGSWLECDDLKGPCSERH 

KKFEVPASEIKIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 

KPVSLTS CSVGDAASAETAS VTHP KDI S VAPRTLSQDTAVTHGD 

HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 

AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 

VVNTNMQSVQLNTEDTVNTKSVNNTDATGLIQGVKSVEIEKDAQ 

T.^nFT.TPTfTFAT.TfPRPVT < ?nVQT^.KTfTn?T^Anc:nTTT*QK'ClT/ , )Mn 

S LKENQKKP FVGS WVKGL I SRGASFMPLCVSAHNRNTITDLQ PS 
VIvGVlWFGGFKTKGINQIvASjHVSKKARKSASKPPPISK^ 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGQIHKLRLrCLRKIvL 

QVPQDGSPNl^CESIEDLxxNELPx^IDIANESACTTVPGVSLYSS 
QxTxEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
E FNDVSQNTHLRQDHNYCS P TKKNPCE VQPDS LTNNACVRTLNL 
ESPMKTDI FDEFFSSSALNAIxANDTLDLPHFDEYLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
DGGE/LHS /ATTEHKP/VnATPVWTiTVTTTiT^TWnZiPT.PnT 


6819 


1 


961 


GIPCTEMGNFDNAOTTGEIEFAIHYCFKTHSLEICIKACKNlxAY 
GSEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTPQETLK 
YQVAPAQLVTOQLQVS VWHLGTLARRVPLGEVI I PLATWDFEDS 
TTQSFRWHPLPAKADKYEDSVPQSNGELTVRAKLVLPSRPRKIiQ 
EAQEGTDQPSLHGQLCLVVLGAKNLPVRPTCTLNSFVKGCLTIiP 

rWViTTT .RT .Tf <5 PVT .P Vd A P PTIMTTW Q FVPQfTVT P lOT .P n Q Q T »PT .m/W 

JUT^/\^i\JLJ X\Xj.lvO IT V LXE\<VS£n^Ir^nXuL0 T V C uO V X XW>^LlK^OOjJE«Jj X V W 

DQALFGMTORL1^GT\RLGSKGDTAVGGDACSQSKLQWQKVLSS 
PNLWTDMTLVLH ' 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKVVRKHHRVIA 
GQFFGHHHTDS FRMLYDDAGVP I SAMFITPGVTPWKTTLPGWN 

d ANNP A T PV PP VT)P ATT . Q T ,KT>MVT VPMItfT.Q H ANA r*V3TPP WFT ,PV 

QLTEAYG VPDASAHSMHTVLDR I AGDQS TLQRYYVYNSVS YSAG 

VCDEACSMQHVCAMRQVDIDAYTTCLYASGTTPTO 

LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
R FCVT^TLD PETLPA I ATTL I DVL F Y QHST P KRA A Q Q Q PP P Q Q TT 

FFAFSLIEGYI\SIVMDAETOKKFPSDIxlxLTSSSGELWRMVRlG 
GQPx^FDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1088 


—f -1. Kj 


F.FDTYR /RVnRRRVPVTPnnL^SNfJFPPTnWfSPQPTl/WPTriQDOJVr 

kFC^TlxDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFS1jIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPI^FDECGIVAQITVGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6823 


654 


221 


PPKLLSRWARMGHGDEIV\LSD1^FPGLLHLPVVGPWRSVQTAC 

d T PHT .T .F A17T .TT T .T .PT X\T V"\7P Q P TV 21 VMP T ATP QT» TTPd m .f"VT P"\ /TUTT? 
wX x^^xjxjci/1 v xjJ\xjxjirxjxy l x V Xloxr/irtVrlxliljVr olJr»JjK.oJjy x xr VW in 

YESILRRAGOn^ALAKIERFEFYERAKKAFAWATGET 
ILRKGVLALNPLL , ..* _ r 


6824 


858


104 


LLLAQR WG WG \ CCFFS LAVS VKMNVLL FAPGLL FLLLTQFGFJRG 
ALPKLGICAGLQVVLGLPFLLF^PSGYLSRSFDLGRQFLFHWTV 
NWRFLPEALFLHRAFHlxALLTAHLTLLLLFALCRWHRTGESILS 
lxLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS 
AALHICHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 




OOVjJLr VJljyi-loJJ XrJrV X ± oLf 1 \3r1 J.XJX X Xj o XJrlCi XT V»>lXA3i^.\— X r VrU_iLi 

PKFDPLVI LKTLS S YP I KSMMGAP I VYRMLLQQDLS S YKFPHLQ 
NCLAGGESLLPETLENWRAQTGLD1REFYGQTETGLTCMVSKTM 
KIKPG YMGTAAS CYDVQ I IDDKGNVLPPGTEGDIG IRVKPIRPI 
GIFSGYVDNPDKTAAN IRGDFWLLGDRG IKDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVR1^PJ3IHLDSPLLSLSF 
P FGPlxALPMDGYGDSLWEFJJEYKFCIiALVISTKLYHVRC 


6826 


2304 


954 


LKTESFKPW/VNIALAFHLLGERASPNSFWQPYIQTLPREYDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FTYEDYRWAVSS VMTRQNQ I PTEDGSRVTLALIPLW 
DMCx^TNGLITTGYNLEDDRCECVALQDFRAGEQI YI FYGTRSN 
AEFVIHSGFFFD™SHDRVKIKx^VSKSDRLYAM1^VIxARAGI 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RIFTLGNSEFPVSWDMEVKLWTFLEDRASLLLKTYKTTIEEDKS 
VLKNHDLS VRAKMAI KLRLGEK2ILEKAVKSAAVNRE Y YRQQME 
EKAPLPKYEl^NLGLLESSVGDSRLPLVLRx^EEAGVQDALNI 
REAIS KAKATENGLVNGENS I PNGTRS ENESLNQES KRAVEDAK 
GSSSDSTAGVKE 


6827 


1


779 


SSVVEPGLSVLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMALQPIiQAAPEPGAQGQREKNSQHPPALAPPG 
HQGHSHGHQGGTDITWMVLLGDGLHNLTDGLAIGAAFSDGFSSG 
LSTTIAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPSS 
GAPAYA\HVLLQGLGIiLLGGCLMLAITLLEERLLPVTTEG 


6828 


3 


1654 


KSQHG/ W I LQLMHS CKEGYVKDLKGNPGIiHRAMLDLDNGTRPSE 
LGHLSQTAS LKRGSS FQSGRDDTWRYKTPHRVAFVEKLTKLVLS 
QLPNBWKLWISYVNGSLFSETAEKSGQIBRSKNVRQRQNDFKKM 
I QEVMH S LVKLTRGALLPLS I RDGEAKQYGG WEVKCELSGQWLA 
HAI QTVRLTHESLTALE I PNDLLQTIQDLILDLRVRCVMATLQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQEEVCQLSINIMQVFIYCLEQLSTKPDADI 
DTTHLSVDVS S PDLFGS IHEDFSLTSEQRLL I VLSNCC YLERHT 
FLNIAEHFEKHNFQGIEKITQVSMASLKELDQRLFENYIELKAD 
P I VGSLE PG I YAGYFDWKDCLP PTGVRNYLKEALVNI IAVHAEV 
FTISKELVPRVLSKVIEAVSEELSRLMQCVSSFSKNGALQARLE 
I CALRDTVAVYLTPE S KS S FKQALEALPQLS SGADKKLLE E LLN 
KFKSSMHLQLTCFQAASSTMMKT 


6829


1 


782 


MRMEAGEAAP PAGAGGRAAGGWGKWRLNVGGTVFLTTRQTLCR ' 
EQKSFLSRLCQGEELQSDRDETGAYLIDRDPTYFGPILNFLRHG 
KLVLDKDMAEEGVLEEAE FYNIGPL I RI I KDRMEEKD YTVTQVP 
PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSS YNYGSED ~ 
QAEFLCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EE VEVEQVQVEADAQEK/ CCYKPEAPGCEAPDHLQGLGVP I 


6830 


1


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETLTLQKQL ' 

RYRFPEIJu^PDTCTGFRFCHQIiDFSTSGALCVALNKATKAGSAYR 

CFKERRVTKAYLALLRGHIQESRVTI SHAIGRNSTEGRAHTMCI 

EGSQGCENPKPSLTDLVVLEHGLYMDPVSKVLIjKPLTGRra 

RV\HCSALGHPVVGDLTYGE^SGREDRPFRMMLHAFYLRIP1OT 

ECVEVCTPDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 

DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCLQWLSEWT 

LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKFISEVSREDYGKKEISGDSEEMNINSWTSADGENL 
EIQSYSLIGE KLVME EAKT I VP PHVTDS KRVQKPAI AP P S KWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPQQPKSASSNFASKNITKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRSVS?TEKKDNIiENR\SYTL\AEKKVLAEKQNSV\APLELRDS 
NEIGKTQITLGSRSTELKESKADAMPQHFYQNEDYNERPKIIVG 
SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VSLKKKRSEDDYEPIITYQFPKRENLLRGQQEEEERLLKAIPiiF 
CFPDGNE WASLTEYPRETFS FVLTNVDGSRKIGYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDBVEKRHQISMAVIYPFMQGL 
REAAFPAPGKTVTLKSFIPDSCTEFISLTRPLDSHLEHVDFSSL 
LHCLSFEQILQIFASAVLERKIIFLAEGLSTLSQCIHAAAALLY 
PFSWAHTYIPWPESLLATVCCPTPFMVGVQMRFQQEVMDSPME 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQ INEHVSGPFVQFFVKIVGHYAS YIKREANGQGHFQERS FCK 
ALTS KTNRRFVKKFVKTQLFSLFIQEAEKS KNP PAG YFQQKILE 
YEEQKKQ/TETKGKNCEIRAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPRVKSTRPGSSDIN 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVREPDHMEIiEEDRAGQLNMRGVFLHVLGDALGSVIVW 
NALVFYFS WKGCSEGDFCVNPCFPDPCKAFVE I INSTHAS VYE A 
GPCWVLYLDPTLCVVMVCILLYTTYPLLKESALII^QTVPKQID 
IRNLIKELRNVEGVEEVHELHWQLAGSRIIATAHIKCEDPTSY 
MEVAKTIKDVFHNHGIHATTIQPEFASVGSKSSWPCELACRTQ 
CALKQCCGTLPQAPSGKDAEKTPAVSISCLELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSI> 


6834


78 


1151 


AGQERPAP IWRLLWLPTPSVSRKAEPAHI P INR*GA*E *RGGLP 
LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP 
CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNLRAGEE 
S RAL * LI EE VTQVRDAHLGNAWG CAQCLS QGQ VGSALAKALLE 
AAAAVRDCKEVLTVSGDKQQABVSVRL*VRDVCVEEAGCVEFGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAP CRPHQEAGVS CHE 
LQQWGDAL*ARE*APQ I IVLLLLEDVAQLRTGKKA*DLWDVE 
QLLRQL 


6835


1 


834 


GIPAADR\EASLELIKLDISRTFPNLCIFQQGGPYHDMLHSILG 
AYTCYRPDVGYVQGMSF IAAVLI LNLDTADAF I AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFEVFFEENLPKLFAHFKKNNLTPDIYL 
IDWIFTLYSKSLPLDLACRIWDVFCRDGEEFLFRTALGILKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MS CGRPPPDVDGMITLKV \DNLTYRTSPDSLRRVFEKYGRVGDV 
YIPRE^HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
. KSRSRSKR.PPKSPEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP \ SGTSSSGSKASGP 
PNP PAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQASGAAVGGS SAGET 
RGAPTPHEKALTSPSWGKGAELLLGDQPDL IGS LDGGAKSDSSS 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP 
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDS ELGS CCSEAVKS AMS TI 
DLDSLMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


LTDTPPPKTHMIHHSISDYKATLRCWAIjGFYPMEITLTWQQDEE 
DQTRDMELVETRPAGDGTFQKWAAWVPSGEE / Q/ RYMCHVQHE 
GLPEPLTl^WEQSSQPTIPIVGIVAGLVLLGAVVTGAWSAVMC 
RKKNSDRVS YSEAASSDHAQGSDVSLTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLSWPQVKRIiDALLSEPIPIHG 
RGNFPTLS VQPRQIRAGGPQHPGGAG \ IHVHRVRLHGSAASHVIj 
HPESGLGYKDLDLVFRMDLRSEASFQIiTKAWLACLLDFLPAGV 
SRAKITPLTLKEAYVOKl^VKVCrrD5>nRM^TiTS1^55WK < ?nKN\rRTiTf 

FVDSVRRQFEFSIDSFQIILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVIATRSPEEIRGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
A7^RRYACLVTLHRWNES TVCLMNHERRQTLDL I AALALQALAE 
QGPAATAALAWRP PGTDGVVPATVNYYVTPVQPLLAHAYPTWLP 
CN 


6840 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLIRVDGKGSIKEL 
FPTGKQLEPLVAPl^GKVAVGQDDLTVVLNEEGICTQKCALNW 
TDIPVAMEHQPPYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFI 
TSGGSN 1 1 YVASNHFVWRL I P VPMATQ I QQLLQDKQFEIiALQLA 
EMKDDSDS EKQQQ I HH I KNLYAFNLFCQKRFDESMQVFAKLGTD 
PTHVNK3LYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHIiALIDY 
LTQKRSQL VKKIiNDSDHQSSTS PLMEGTPTI KS KKKLLQI IDTT 
LLKCYIiHTNVALVAPLLRLENNHCHIEESEHVIjKKAHKYSELI I 
LYEKKGLHEKALQVLVDQSKKANS PLKGHERTVQ YLQHLGTENL 
HLIFSYSVWVLRDFPEDGLKIFTEDLPEVBSLPRDRVLGFLIEN 
FKGLAI P YIiEHI IHVWEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
P FDGLLE ERALLLGRMG KHEQALF I YVH I LKDTRMAEE YCHKH Y 
DRNKDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPKANLQAA 
LQVIiEIiHHSKLpTTKALNLLPANTQINDIRI FLEKVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERILHQQVKCI ITEEKVCMVCKKK 
IGNSAFARYPNGWVHYFCS\KEVNPADT 


6841


1


3206


TPSTTGTKSNTPTSSVPSAAVTPLNESLQPLGDYGVGSKNSKRA 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 
MCPETRLDRTGSSPTQGIVNK71FGINTDSLYHELSTAGSEVIGD 
VDEGADLIX3EFSGMGKEVGNLLLENSQLLETKNAIiNVVKNDLIA 4 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 
I IARRE PKEEAEDVSS YliCTESDKI PMAQRRRFTRVEMARVLME 
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR ' n 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG r 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLS PNGGQEDTRMKNVPVPVYGRPLVEKDPTMKLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKWI IDANQPGTVVD 
QFTVCNAHVLCISSIPAASDSDYPPGEMFLDSDVNPEDPGADGV 
LAGITLVGCATO<^PR^CSSRG^ . [ . 
NPSQSTEEATEATEVPDPGPSEPBTATLRPGPLTEHVFTDPAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWL YVHSAVANWKKCLHS I KLKDS VLSLVHVKGRVLVALAD 
GTLAI FHRGEDGQWDLSNYHLMDLGHPHHS IRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKSFDAHPRRESQVRQLAW I GDGVWVS IR 
LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFSFVRITAL 
LVAGSRLWVGTGNGWISIPLTETWIiHRGQ\LLG\LRAI«CTSP 
TSGEG\ARPGG\IIHVYG\DDSSDRAARSFIPYCSMAQAQLCFH 
GHRDAVKFFVSVPGNVLATLNGSVLDSPAEGPGPAAPASEVEGQ 
KLRNVLVLSGGEGY I DFRI GDGEDDETEEGAGDMS QVKPVLSKA 
ERSHI I VWQVS YTPE 


6842 


3 


926 


RCQQLSATILTDHQYLERTPLCAILKQKAPQQYR I RAKLRS YKP 
RRLFQSVKLHCPKCHLLQEVPHEGDLDIIFQDGATKTPDVKLQN 
TSLYDSKIWTTKNQKGRKVAVHFVKNNGILPLSNECLLLIEGGT 
LS E I CKXSNKFNS VI PVRSGHEDDELLDLS APFL I QGTVHHYGC 
KQWST * RS IQNLNSLVDKTS WI PSSVAEALGI VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQ I PASEVLMDDDLQKS VDMIMDMFC 
P PGIKIDAYPWLECFIKSYNVTNGTDNQICYQI FDTTVAEDVI 






851 


NHRKVLSGAKRYECNECGKS FAYTSS LI KHRRIHTGERPYECSE 
CGRSFAENSSLI KHLRVHTGERPYECVECGKSFRRSS SUjQHQR 

VHTRERP YECSECGKS FSLRSNLIHHQRVHTGERHECGQCGKS F
SRKSSLIIHLRVHTGERPYECSDCGKSFAENSS&IKHLRVHTGE
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF
LLLQGQRVHTGSRCYECDKWGIFFS*NASFF , R*KSAPTEEVPFE

CNECEKAFSPLSLVTTIFT 


6844 


244 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYIjSPQEIjEDVFY 
QYDVKSE I YS FG I VLWEIATGDI P FQGCNSEKIRKLVAVKRQQE 
PLGEDCPSELREIIDECRAHDPSVRPSVDEILKKLSTFSK*CIK 
I 


6845


3 


1519 


VAVRDECY WRHVFWDQDLWMLLF I LMCHPETARARLE YR I RTLD 
GALENAQNLGYQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV 
GLAFELYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE 
Ki HJjRGVMS PDEYHSGVNNSVYxNVLvQNSLRFAAALAQDLGLP 
I PSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
YPVPFSLSPDVRRKNLEIYEAVTSPQGPAMTWSMFAVGWMELKD 
AVRARGLLDRS FANMAE P FKVWTENADGSGAVNFLTGMGGFLQA 
WFGCTGFRVTRAGVTFDPVCLSG I SRVSVSGI FYQGNKLNFS F 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQS PLWVTLGSSSP 
TESLTVDPASE * SGTGASETSLGPSLWPRLHPPLIjGTLLACHPS 


6846 


213 


1258 


LYFLKTIK*LNR3^AEHP*YENEKLTKLROTIMEQYTRTEESARG 
1 1 FTKTRQSAYALSQW I TENEKFAE VGVKAHHL IGAGHSSEFKP 
MTQNEQKEVISKFRTGKINLiLIATTVAEEGLDIKECNIVIRYGL 
VTNE I AMVQARGRARADESTYVLVAHSGSGVI EHETVNDFREKM 
MYKAIHCVQNMKPEEYAHKILELQMQSIMEKKMKTKRNIAKHYK 
NNPSLITFLCKNCSVLACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTLQKKC^YQINGEIICKCGQAWGTMMVHKGIiDLPCIjKIR 
NFVWFKNNSTKKQYKKWVELP I TFPNLDYSECCIjFSDED 


6847 

\J w TC f 


1/ Eft 

' » 


1 A Q 


SMCWNSDRLEMPLIDLALILYPPSYVPYTGHLSDDSLSRKYCLT 
WFEDALNGVL*RAEAIQPH(^7NAGDRMEKFRQ 
PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGWRSLDALGWEERQLALVKGLLAGNVFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDS YSEWLQRLKGPPHKCAL I FADNSG 
ID I ILGVFPFVRELLLRGTEVIIACNSGPALNDVTHSESLIYAE 
R I AGWDPVVHSAIjREERLLIjVQTGS s S PCLDLSRLDKGLAALVR 
ERGADtWIEGMGRAVHTNYHAALRCESLKIAVIKNAWIiAERLG 
GRLFSVTFKYEVPAE 


6848 


19 


16 



AMWWNSLJDGIRNIVLSNPKKROTLSLAMLKSLQSDILHDADSND 
LKVIIISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
IRNHPVPVIAMVNGLATAAGCQLVAS CDIAVASDKS SFATPGVN 
Vvslir yt? IFUVAljAl^VPKK.VAi_jEMLr TGEPISAQEALLHGLIjNK 
WPEAELQEE TMR IARK I AS LSRPWS LGKATF YKQLPQDLGTA 
YYLTSQAMVDNIiALRDGQEGITAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SLGVDGSCLEQGSPAPRPQTDTSP*PVGNWATQQEDLYHQSYEC 
VCVLFASVPDFKEFYSESNINHEGLECLRLLNE I IADFDELLSK 
PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHLGTM 
VEFAVALGSKLDV1NKHSFNNFRLRVGLNHGPWAGVIGAQKPQ 
YDIWGNTVNVASRMESTGVLGKIQVTEETAWALQSLGYTCYSRG 
VI KVKGKGQLCTYFLNTDLTRTGP PS ATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD 

T TAT 7T WT TTl T THOU WTTCR WTOAIilll1\TT ^T7T TTT riTTOnT mmnm* nnnv 

IjDVIjKijEJLjI PEAKI PAK I SQMTNLQELHLCTCPAKVEQTAFS FL 
RDHLRCLHVKFTDVAE I PAWVYLLKNLRELYL IGNLNSENNKM I 
GLESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLNSIiKKMMNVAELELONCELERI PHAIFS LSNLOELDLKS 
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
YFSNNICLESLPVAVFSLQKIiRCLDVSYNNISMI PIE IGLLQNLQ 
HLHITGNKVDILPKQLFKCIKIiRTLNLGQNCITSLPEKVGQLSQ 
LTQLELKGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK 
EALNQDINIPFANGI 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGXiMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQEIjDEEEPDIWFDFET 
MARPWTEDGDWTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE 
rirUN 1 WoA^iNUJSiNuo^V JjUUoiv^Jjr ±yufr\jjjjt Ar»rl\JJ/\Gr rrSQD 
INSHLASLSMARNTSPTPDPTVREALCAPDNLNAS IESQGQIKM 
YINEVCRETVSRCCNS FLQQAGMLLISMTVINNMLAKSASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 

1? T DWnMD P T T .T .T?TD Ti D 


6852 


1 


407 


RTRGEETYANFIKHNDGKNIFYAARTPATLFAVMFAMYIISGLT 
GFIGLNSIAVLO^VMGLALIFLCTWAYVKYSGEFREIGTVIDQ 
IAETLWEQVLKPLGDNLMEENIRQSVTNSIKAGLTDQVSHHARL 
KTD 


6853


*> 
J 


ACQ 


(jJJia t-A V C ltLLtl )\fNVu VKX u X CW H 1 r riK TCVDP WLLEHRTCPMC 
KCDILKALGIEVDVEDGSVSLQVPVSNEIFNSASSHBEDNRSET 
ASSGYASVQGTYEPPLEEHVQSTNESLQLVNHEANSVAVDVIPH 
VDNPTFEEDETPNQETAVREIKS 


6854 


1148 


585 


HESYIGTFDPGELCVCAAIQWLQDNSASYFIiNRKLVYEPSTQAK 
PVKNTFLRMWIYSHHIYQQDIjRKKILDVGKRLDVTGFCMTGKPG 

T T/^X7I7r , TrVT7U/^T?l?'n , ljinJ r PTD VDMMVtIT Or"VUJ\D017C"PP/ , 'Mr'DnT T3 

X lL.vr>tjr JkUiriL-JtiiCir Wn 1 ±KXrTvWKnloL.Jt\jiAiio vh. IJiGNGliUijK 

LFHSFEEL^LEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI
LFGIESKSSDS


6855 


1913 


1148 


GRVGGRVGRI CS PLSGANE YIAS TDTIiKTEE VLLFTDQTDDLAK 
EEPTSLFQRDSETKGESGLVLEGDKEIHQIFEDLDKKIiALASRF 
YIPEGCIQRWAAEflWVAliDAIjHREGIVCRDI^ 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL ;t 
GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARSLIQQL 
LQFNPLERLGAGVAGVEDI KSHPFFTPVDWAELMR 


6856 


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKAIAVKQQR 
TV x K JjTLjVKA WW vu&ijQA X AQ UVS hGNPDFl E VKG VTYCGES S A 
SSLTMAHVPWHEEWQFVRELVDLIPEYEIACEHEHSNCLLIAH 
RKFKIGGE^TWINYW^ELIQEYEDSGGSKTFSAKDY 
HWALFGASERGFDPKDTRHQRKNKSKAISGC 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNHIKPSHSAAQTWCGSPTPASAPNH 
KLMAMEQGKTLPSATEDAKEEGLEAQ ISRLAEL I GRLESKALWF 
■UJ-» Wv KJjbJJii DC* 1 NMHLiy Li VKU bMAV C JPbyjjSEFijDSJjRQYLRGT 
TGVRNCFHI TAVRLSDGFTFVI YEFWETEEAWKRHLQSPLCKAF 
RHVKVDTLS QP EALSR I LVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA 
LRCVQTAKL.ILEELKLEKKI KIRVEPGI FEWTKWEAGKTTPTLM 
SLEELKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
IVNTCPQOTGVILIVSHGSTLDSCTRPIiLGLPPRECGDFAQLVR 
KIPSIXSMCFCEENKEEGKWEIiVNPPVKTLTHGANAAFNWRNWIS 
GN 


6859 


i 

■ 




riTTTMTPWAVTVZ^ virv'D'D vd cnccr^nvwT.cnTT^CDCciv^T t ircr* 
(jCi xnc ]\]SJ\]\ 1 1V>U\J^J^X'KJ^K,•^ Uo bovj I IN J->o JJX X roo IuXjLiJNoo 

KTNSVESLPELLTSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 
AKVKPYVNGTSPVYSREDLKPWEKSPILKISAPQPIPSNRIDTT 

VSHGVXLSQKQRKMIALTTKENNSGMNSMErVLFTPSKAPKPVN 
AWAS S LHSVS S KS FRDFLLE E KKSVTSHS SGDHVKKVS FKG I EN 
SQAPKIVRCSTHGTPGPEGimiSDIjPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 
YEAFGNPEEFVIVERTPQGPLAVPMWNKHGC 


6860 


1889 


1515 


DKD KKRQKKRG I FP KVATN I MRAWLFQHLTHP YPS EEQKKQLAQ 
DTGLTXLQVIINWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 
MGSFVl^GQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKI) KKRQKKRG I FPKVATN I MRAWLFQHLTHPYPS EEQKKQLAQ 
DTGLT I LQVNNWF I NARR 1 1 VQPMIDQSNRAVSOGAAYSPBGQP 
MGS FVLDGQQHMG IRPAGPMSGMGMNMGMDGQWHYM 


6862


2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA - 

DEEDPLGPNCTYDKTKSFFTDNISCX>DNRERRPTWAEERRLNAET 

FGIPLRPNRGRGGYRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGREFADFEYRKTTAFGP 


6863 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCKQVCSTVGGS 
AI CS CFPGYAIMADGVS CEDQDECLMGAHDCSRRQFCVNTLGS F 
YCVNHTVLCADGYILNAHRKCVDINECVTDLHTCSRGEHCVNTIi 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHTCQPGFLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSLSEPCRPGFSCI 

Xi J. TWUJL X V<yiUl£ UxWVlvu X ALfliJ U\J\J X IVta. VU VIM uUuluVIlKLwuU 

QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GSYQCTCRQGYQI^DGHTCTDIDECAQGAGIIjCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCHNIQGS 
FRCLRFECPPNYVQVSKTKCERTTCHDFLECQNS PARITHYQLN 
FQTGLLVPAHI FRIGPAPAFTGDT3 ALNI IKGNEEGYFGTRRLN 
AYTGVVYLQRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHIFFT 
TFAL 


6864 


2


2933


LADSSPSNLQ 1 1 IKELLSMHHQPDPALTKEFDYLPP VDSRSS SG 

FVGLRNGGATCYMNAVFQQLYMQPGIjPESLLSVDDDTDNPDDSV 

FYQVQSLFGHLMESKLQYYVPENFWKI FKMWNKEliYVREQQDAY 

EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 

EREEAFMALNLGVTSCQSLEISLDQFVRGEVLEGSNAYYCEKCK 

EKRI TVKRTC I KSLPSVLVTHLMRFGFDWESGRS I KYDEQ I RFP 

WMLNME P YTVSGMARQDSSS EVGENGRS VDQGGGGS PRKKVALT 

ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 

EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 

QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 

PHRPNNDRLS I LTKLVKKGEKKGLFVEKMPARI YQMVRDENLKF 

MKNRDVYSSDYFSFVl^IiASLNATKLICHPYYPCMAKV^ 

FLFQTYLRTKKKLRVDTEEW IATIEALLSKSFDACQWLiVEYF I S 

SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKLKSL 

HOLLE VLLALLD KD VPENCKN CADYF FL FMTFVOICOR T R ZLf?nT . T. 

LRHS ALRHMI S FLLGASRQNNQ IRRWS SAQAREFGNLHNT VALL 
VUISDVSSQRIWAPGIFKQRPPISIAPSSPLLPLHEEVEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKNTFQLLHE I LVI EDP IQVERVKFVFETEN 
GLLALMHHSNHVDSSRCYQOTKFL^l^OKCPAAKEYFKENSHH 
WSWAVQWU3KKMSEHYWTLQSNVSNETSTGKTFQRTISAQDTLA 
YATALLNEKEQSGS SNGSESS PANENGDRHLQQGSES PMM I GEL 
RSDLDDVDP 


6865 


1820 


1242 


D P E RWKHLS KVT P PGS S VS TTP VQ WRLQ S PQ SQGSMM P S CNRS 1 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAAD YNQALGTCRLAGTALCVAAGVLLAI CLFWAM IGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQLSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRDLDWAW INAVSAFKALEQDLPVN I KF 
IIEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDNLWISQRKP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
SLVDSSGHILVPGI YDE WPLTEEE INTYKAIHLDLEE YRNS SR 
VEKFLFDTKEEILMHLWRYPSLS IHGIEGAFDEPGTKTVI PGRV 
IGKFSIRIiVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMWSMTL 
GLH P W I ANI DDTQ YLAAKRAI RT VFGTE PDM I RDGS T I P I AKM F 
QE I VHKS WL I PLGAVDDGEHSQNE KINRWNY I EGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTR I MSOP ROKRT » AC3FVRO K"MT iT.n Y<? WIVW^T? CVPOR Q DOW «J D 

LQSAESSPTAGKKLPEVPPSEEEEQEAWVNALLGRI FWDFLGEK 

yWSDLVSKKIQMKLSKIKLPYFMNELTliTEIiDMGVAVPKIIiQAP 
K P YVDHOGLW I DLEMS YNG S FTJ/TLET KMNI/T lTLfi TTRDT.waT.TT 

VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEFIKKKIEE v 
VSNTPLLLTVEVQECRGTIiAVN I P P P PTDRVW YGFRKP PHVELK 
ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


RPTRPPTRPEEIKNLILPYISDMNFVQDLCEDFYELFKTDKGFD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGRI VHLSNS FTQTVNCRKPFFSS W 


6869 


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQLVGEIYQNFFVES 
KE I S VEKSLYKE IQQCLVGNKGIEVFYKIQEDVYETLKDRYYPS ' 
FIVSDLYEKLLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 
NQINEQASFAVNKLRELNEKLEYKRQALNSIQNAPKPDKKIVSK 

NGEQLPCYFVMVSLQEVGGVETKNWTVPKPXSEFHNLHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 
LCQSEALYAFLS PS PDYLKVI DVQGKKNSFSLSSFLERLPRDFF " 
SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMLIGEIFELRGM . 
FKWVRRTLIALVQVTFGRTINKQIRDTVSWIFSEQMLVYYINIF- 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 
VGQQNARHG 1 1 KI FNALQETRANKHLLYALMELLLI ELCPELRV 
HLDQLKAGQV 


6870


1


1566


MAAVVAATRWWQLLLVLSAAGMGASGAPQPPNILLLIjMDDMGWG 
DIXSVYGEPSRETPNLDRMAAEGLLFPNFYSANPLCSPSRAALIiT 
GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQLLPELLKKAG 

NIPVYRDWEMVGRYYEEFP INLKTGEANLTQI YLQEALDFIKRQ • 
ARHHP FFLYWAVDATHAPVYAS KPFLGTSQRGRYGDAVRB I DDS 
IGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC • 
GKQTTFEGGMREPAIAWWPGHVTAGQVSHQLGSIMDLFTTSLAL ! 
AGLTPPSDRAITCLNI^PTLLQGPJLMDRPIFYYRGDTLMAATLG ' 
QHKAHFWTWTNSWENFRQG I DFCPGQNVSGVTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYOEALSRITSWOOHORAIjVPAOP ' 
QLNVCNWAVMNWAPPGCEKLGKCLTPPESIPKKCLWSH 


6871


209 


1126 


RMSLNPP I FLKRSEENSSKFVETKQSQTTS IASEDPLQNLCLAS 
QEVLQKAQQSGRS KCbKCGGSRMFYC YTCYVPVENVP I EQI PLV 
KLPLKIDIIKHPNETDGKSTAIHAKLLAPEFVNIYTYPCIPEYE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFIiVDYHT 
DILKEKYRGQYDNLLFFYSFMYQLIKNAKCSGDKETGKliTH 


6872 


880 


459 


FGUjMWLSLIFMKGNCVREDLI FNFLFKLGLDVRETNGLFGNT 
KKLITEyFVRQKYLEYRRIPYTEPAEYEFLWGPRAFIiETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECBWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCN 
I IHTE I FGFSNNYWELRRRRP KLKKLKKLLMENP YEGPDSQKEK 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
D YEMKLLNHVTQLTOS ES WS FGKVPLNTCLQELGPLE PE EM I EH 
CLKCYGKKYVDEGE VYFELDADKI CRAAARMLLQNAVKFNLAEF 
QEVWQQSVPEGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQBRFNSIjFSIjREKWTEEDIAPYIQDLCGEKQTIGALLTKYSH 
SSMQNGVKVYNSRRPIS 


6874 


1 


307 


DS IADHVNSAAVNVEEGTKNLGKAAKYKLAALPVAGALIGGMVG 
G P IGLLAG FKVAGI AAALGGG VLG FTGGKL I QRKKQKMME KLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIIHSISLLNAEEHSIA 
TLLLRIEKBELDMKGSGFYVSLEWVTISKKNQDNKKYEIIKRDI 
LRGKSVPHYAAIEPDGNGLMIVSYKSLTFVQAGQDLEENMDEDI 
SEKIKEPLYYWQQTEDDLTVTIRLPEDNTKEDIQIQFnPDHINI 
VLKDHQFLEGKLYSS IDHESSTWI IKESNSLEISLIKKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAER1»MHLTSEELNPNPDKEKPP 
CNAQELEECDIFFEESSSLCRFIX3NTLKTTHVVNLGSNQYLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVLYNR 
KEGRQVGCVAKQQVASLETNDP I LGFQATNERLFVLTTKNLFL I 
KVNTEN 


6876 


41 


1285 


VGEMTL I WRHLLRPLCLVTS APR I LEMHP FLS LGTSRTS VTKLS 
LHTKPRMPPCDFMPERYQVIFLWSGSEANELAMLMARAHSNNI 
DI I S FRGAYHGCS PYTLGLTNVG I YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDS PVQT I RKCSCAPDCCQAKDQYI EQFKDTLSTS 
VAKS I AGF FAE P I QG VNG WQYP KGFLKEAFELVRARGGVC IAN 
EVQTGFGRLGSHFWGFQTODVLPDIVTMAKGIGNGFPMAAVirT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMLLKFAKLRDEFE I VGDVRGKGLMIGIEMVQDKIS CRPL 
PREEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTS PSPARAYAPPTERKRFYQNVS ITQGEGGFEINLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDT I KYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVELQRNEWDPII 
EWAEKRYGVE I SSSTS IMGPSIPAKTREVLVSHIiASYNTWALQG 
IEFyAAQLKS^^TIJGLIDLRLTVEQAVLLSRLEEEYQIQKWGN 
IEWAHDYELQELRARTAAGTLF IHLCSESTTVKHKLLKE 


6878 


931 


263 


QTLQGDFKNRAEMIDFNIRIKNVTRSDAGKYRCEVSAPSEQGQN 
LEEDTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAP 
EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGEYSCEARNSVGYRRCPGKRMQVDDLNISGI IAAWWALVIS 
V CGLG VC YAQRKG YFS KETS FQKSNS S S KATTMS ENDFKHTKS F 
II 


6879 


3 


845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK 
KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVI SHS FVRKLAPNE 
FPHKLYIQNYTSAVPGTCLTIRKWLFTTEEEILLNDNDLAVTYF 
FHQAVDDVKKGY I KAEEKS YQLQKLYEQRKMVM YLNMLRTCEGY 
NEI I FPHCACDSRRKGHVITAlSITHFKLHACrrEEG 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 
CFERVFCELKWRKEEY 


6880


4110


1437


RKDNL I AKEWTFPEAKWNTTAKVFbHiKJjGMGHVljl IVyLFIoo 
MANI YNEKILKEGNQLTES I FIQNSKLYFFGILFNGLTLGLQRS 
NRDQIKNCGFFYGHRAFSVALIFVTAFQGLSVAFILKFLDNMFH 
VLMAQVTTVI ITTVSVLVFDFRPSLEFFLEAPSVLLSIFIYNAS 
TTPOVPEYAPRORR TRnT.QnMT.lxrPT? < ? € ?GnGP"RT.FRTjTK;PKSriR5>D 

.TV. XT \i V r D Lrlxivy d Iv J- fvl_/ VJINl UirVXlirv J J VJL/\J£U "i 1 II iIn.J-1 i. IxTiujLIDijU 

EDTF 


6881 


2638 


2244 


NDSKWEDIHVITGALKMFFRELPEPLFTFNHFNDFVNAIKQEPR 
QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQEPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLE I EGRDCGEATAQW I TS FLKSQP YRLVHFE PH 
MRPRR PHQ 1 7VDL FRP KDQ I AYS DTS P FL I LS EAS LADLNS RLE K 
KVKATNFRPNIVISGCDVYABDSWDELLIGDVELKRVMACSRrT 

LTTVDPDTGVMSRKEPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLFITLTYQVLSLHGWGPGIHLQKEGAFPVTQNR 

ALQLLYDLRYIiNIVLTAKGDEVKSGRSKPDSRIEKVTDHLEALI 

DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 

NSQEPHNILPLASSQIRFGLLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


j E FER VTAEA VKPRE TS E PRAAAQR FCEKFP FL 


6885 




1554 


STGQFWHVTDIjHLDPT YH I TDDHTKVCASSKGANASNPGPFGDV 

TVINVITNMTTTIQSLFPNLQVFPALGNHDYWPQDQLSWTSKV 
YNAVANLWKPWLDEEAISTLRKGGFYSQKVTTNPNLRI ISLNTN - 
LYYGPNIMTLNKTDPANQFEWLESTLNNSQQNKEKVYIIAHVPV 
GYLPSSQNITAMREYYNEKLIDIFQKYSDVIAGQFYGHTHRDSI 
MVLSDKKGS PVXISLFVAPAVTPVKS VLEKQTNNPGIRLFQYDPR 
DYKLLDMLQYYLNLTEANLKGES I WKLEYILTQTYDIEDLQPES 

LYGLAKQFTILDSKQFIKYYNYFFVSYDSSVTCDKTCKAFQICA 
IMNLDNISYADCLKQL Y I KHNY 


6886


2


1341


QCGGIPGREGGSSRPLEEGTGSSPACVRGAAPGSEDAFYPTRAK 
QARVSQELKKAAKRTVS ISEGPDTLGDGMRERRETLALAPEPEP - 

GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS ^ 

LLATGGADRL IHLWNWGS RLEANQTLEGAGGS I TS VDFDPSG Y 

QVIiAATYNQAAQLWKVGEAQS KKliiSGHKDKVTAAKFKLTRHQA 

VTGSRDRTVKEWDLGRAYCSRTINVLS YCNDWCGDHI 1 1 SGHN 

DQKIRFWDSRGPHCTQVIPVQGRVTSLSIiSHDQIjHLLSCSRDNT 

LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFSPDRSYALAGSCD 

GALYIWDVDTGKLESRLQGPHCAAVNAVAWCYSGSHMVSVDQGR . 
" KWLWQ 


6887


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGdCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGIAAQTPPTSKQVAWRAFLTG 
TYRSQS PRS PAGP FRGGTGWWPE PAVCLCVAVGPQRLS SPGLVY 
NASGS EHCYD I YRL YHS CAD PTG CGTG P DARAWDYQ ACTE I NLT 
FA5NNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRASHPEDPASWEARKLEATI IGEWVKAARREQQPALRGGPRL 
SL 


6888 


1 


992 


FVAYVKKEIPHIWTHCLLNPHALVIKTLPTKXjRDALFTWRVI 
NFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFMHKSSl^VDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFSIGDLNEASKWI 
IiDPFLFNIDFVDDSYLMKTOIiAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE ILMPFATTYLCELGFS ITFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534


LTLENQIKEEREQDNSESPNGRTSPLiVSQNNEQGSTLRDLLTTT 
AGKLiRVGS TDAGIAFAPVYSMGAPSSKSGRTMPNILDDI IASW 
ENKI PPS KTS KINVKPELKEE PEESI ISAVDENNKLYSDI PHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAVVSGVHKKMNISL 
WKAES ISLDFGDHQADLLNCKDS 1 1 SNANVKEFWDGFEEVSKRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPG FFVRPDLGPRLCSAYGWAAKDHDI GTTNLH I 
EVSDWNILVYVGIAKGNGILSKAGILKKFEEEDLDDILRKRLK 
DSSEIPGALWHIYAGKDVDKIRBFLQKISKEQGLEVLPEHDPIR 
DQSVm/NKKLRQRIJjEEYGVRTWTLIQFLGDAlVIjPAGALHQVQ 
NFHSCIQVTEDFVSPEHIjVESFHLTQELRIJjKEEINYDDKIjQVK 
NILYHAVKEMVRALKIHEDEVDDMEEN 


6890 


3 


667 


THACGMWI PLYLHRALWHKTAETCNSPPCGAKDSLI FGAITCF 
TGFLGVDTGAGATRWCRLKTQRADPLVCAVGMLGS AI FICLIFV 
AAKS S I VGAYI CIFVGETLLFSNWAITADI LMYWI PTRRATAV 
ALQSFTSHLLGDAGSPYLIGFlQnt.TPnQTTrnQDT.wf'PTPT cr /<va 

LMLCPFWVLGGMFFLATALFFVSDRARAEQQVNQLAMPPASVK 
V 


6891 


1980 


1262 


LRIHQELLSKELKLLRGITIESI IHIGLAAGKEQFMQDASNVMQ " - 

LLLKTQSHLYNMEDNNPETOQAAAYGLGVMAQFGGDDYRSLCSE 

AVPLLVKVI KRAHSKTKKNVIATENCISAIGKI LKFKPNCVNVD 

EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPWIGPNNSN 
LPKI I SI IAEGKINETIKTYEDPCAin3r.AWWPnvrvrQT?riT t»tt x?n 

VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCLE^RIOiDYEGYLCSLriLPAESRSSV"" 

FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 

QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKELE 

NYAENTQSSLLYLTLE ILGI KDLHADHAASHIGKAQG1VTCLRA 

TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDKNVRDVIYDIA 
SOAHLHLKHARSFHKTVPVK , AFPAPT.nT\/c:T,T?nT?r.w - vTOT3T7r^ , irT^ 

I FHPSLQQKNTLLPLYLYIQS WRKTY 


6893 


1 


1 842 


DGERKSMSVERTFSElNKAEEQYSLCQELCSEIiAQDLQKERLKG 
RTVTIKLKNVNFEVKTRASTVQ^VVQTawP'TPa Ta ttitt t vpt?tt\ 

J- i\jjivn Yin, jjv i\i X\-rVkJ i. V JO V V O L/UiuXr Al AIvJGiAmjJ^ JL £>XU 

ADFPHPLRLRLMGVRISS FPNEEDRKHQQRS I IGFLQAGNQALS 

ATECTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVNKQS FQTSQPFQVLKKKMNENLE I SENSDDCQILTCPVCFR 

A<^C ISLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKE 
NVPAS S LCEKQD YEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 

DVRDRGHGRPWQPSIjEPSIiPPTIjCFPSIiSSFSSSWPSAQHLTPS 
VFNPW g 


6895 


2379 


478 


VTYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLIJUjLYAI^SHKACKIAILHLIN^ 

VRSPGDSVIRQQCVEYVTSILQSLCDQDiALILPSSSEGSISEL 
EQLSNSLPNKELWTSICXICL1ATLANSESSYNCLLTCVRTMWFL 
AEHD YGLFHLKS S LRKNSS ALHS LLKRWS TFS KDTGELAS S FL 

E FMRQ ILNSDT IGCCGDDNGLMEVEGAHTSRTMS INAAELKQLL 
OSl<EESPENLFLELE!n"i\rriRH^Trnn'nwT.'nQT.T,nc'\Arr , T vamt t?o 

SGDPLPLSDQDVEPVLSAPESLQI^FNNRTAYVIjADVMDDQLKS 

MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSELERSFL 

SEPSSPGRTKTTKGFKl^KHKHETFITSSGKSEYIEPAKRAHVV 

PPPRGRGRGGFGQGIRPHDIFRQRKQNTSRPPSMHVDDFVAAES 
KEWPQDGI PPPKRPLKVSQKI S S RGGFS GNRGGRGAFTT«;nNR T? 

FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 

PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVSGGSGRGRHVRSFTR 


6896 


1 


555 


GN I VIQKKKYNKQHI I PLENVT IDS IKDEGDLRNGWLI KTPTKS 

FAVYAATATEKSEWMJWINKCVTDLLSKSGKTPSl^EHAAVWVPD 

SEAWCMRCQKAKFTPVNRMHCRKCX3FVVCGPCSEKRFLLPSQ 

SSKPVRICDFCYT5LLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


Gl^LMHEVWGLMERPDWETAIQKPLCSLPAGSGNAIJ^SLNHY 
AGYEQVTNEDLLTNCTLIJ^CRRIaL^ 

SLAWGFIADVDLESEKYRRLGEMRFTI^TFLRIjAALRTYRGRLA 

YLPVGRVGSKTPASPVWQQGPVDAHLVPLEEPVPSHWTVVPDE
DFVLVLALLHSHLGSEMFAAPMGRCAA^VMHL FYVRAGVSRAML



LRLFLAMEKGRHMEYECPYL\nfVPVVAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVASLLKGRQGIYTENERRMGAVIKIRFFKIMLVLUCW 
LSNIINESLLFYLEMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
y(j 1? lilib L»AF YCr WTGCS IjG FQS PRKE I Q WES LTTS AAEG AH P S PL 

MPHENPaSGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASESC 
NKNEGDPALPTHGDL 


6899 


120 


827 


MKVRKNNDAYLLDKNKINMDCFISCFFK3^TTL)MFSHSGILSL 
ijJiHGEE YTFSJjPCAYARS I LT VP WV E LGGKVS VNCAKTG YS AS I 
T FHTKP FYGGKLHRVTAE VKHN I TNTWCRVQGEWNS VLE FTYS 
vt\ja i is. i v jjjj l ru_iAv i JxJvK VKfi-iiUxyUPr isbKKli WKNVj. DSLRES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 

11 r\. it jj rr r\. J. J. ±r J. » y r"" 


6900 


3 


451 


TEVLGSKGIHELRSSTSALHHALEESASl^TMFWRAALPSTHIP 

MEQFIVSQLTRTHDVLKKARTNLEVRKLLHQSEAPSLSPTHHHP 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDNMVQRLETDFKMTLQQQSrLEQWAAWLDNVMMQALKPYEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSAliEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFEWNv 


6903 


1 


149 


RINQVYRQGPTGIHILVIDQMVQNFQDESCFLFSTVKAESSDGI 
HULK 


6904


464


2092


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKKSGNFDLLLC 
VGNFFGSTQDAEWEEYKTGI KKAPIQTYVLGANNQETVKYFQDA j 
DGCELAENITYLGRKGIFTGS SGLQ I VYLSGTESLNEPVPGYSF 
SPKDVSSLRMMLCTTSQFKGVDI LLTS P WPKCVGNFGNSSGEVD 
TKKCGSALVSSLATGLKPR YHFAALEKT YYERLP YRNHI I LQEN 
AQHATRFIAL7\NVGNPEKKKYLYAFS IVPMKLMDAAELVKQPPD 
VTENPYRaSGQEAS IGKQI LAPVEESACQFFFDLNEKQGRKRSS 
TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHLWNIGTH 
CYLALaKGGLSDDHVLILP IGHYQSWELSAEWEEVEKYKATL 
RRFFKSRGKWCWFERNYKSHHLQLQVI PVPISCSTTDDI KDAF 
ITQAQEQQ I ELLE I PEHSD I KQ IAQPGAAYFYVELDTGEKLFHR 
ixviuxr JrliyroKE vliAo c*Al JaW VFUJ^DWKyLylSKEDEETLARR 
FRKDFEPYDFTLDD 


6905 


1 



226 


VSKTGEAETITSHYLFALGVYRTLYLFNWIWRYHFEGFFDLIAI 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


S YDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKI I PAI 
ATTTATVSGLVALEM I KVTGG YP FEAYKNWFLNLAI P I WFTET 
TEVRKTKI RNGI S FT I WDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWQGVKMLYVPVMPGHAKRLKLTMHKLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRLTRYSQGDDDGS 

SSSGGSSVAGSQSTLFICDSPLRTLKRKSSNMKRLSPAPQLGPSS 

DAHTSYYSESLVHESWFPPRSSLEEDHGDANWGEDLRVRRRRGT 

ggsessrasglvgrkatedpTjGs ^sgy*; ^fddyvgysdvdoo^ 

ssrlrsavsragsllwmvats pgrlfrllywwagttwyrlttaa 
slldvfvltrrfsslktflwfllplllltcltygawyfypyglq 
tfhpalvswwaakdsrradegweardssphfqaeqrvmsrvhsl 
errlealaaefssnwqkeamrlerlelrqgapgqggggglshed 
tlalleglvsrreaalkedfrretaariqeelsalraehqqdse 
dlfkkivrasqeseariqqlksewqsmtqesfqessvkelrrle 
dqlaglqqelaalalkqssvaeevgllpqqiqavrddvesqfpa 

WISQFLARGGGGRVGLLQREEMQAQLRELESKIIjTHVAEMQGKS 











AREAAASLS LTLQKEGVIGVTE EQ VHHIVKQALQRYSEDR IGLA 
DYALESGGASVISTRCSETYETKTALLSLFGIPLWYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAVVRLSARIRPTAVTLEHVPKALSP 
NSTISSAPKDFAIFGFDEDLQQEGTLLGKFTYDQDGEPIQTFHF 
QAPTMATYQWELRILTNWGHPEYTCIYRFRVHGEPAH 


6908 


3 


780 


QVPSAAWIiMAVCGLGSRLGI/jSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDEliSSAIGFArJELVTEKGHTFAEEIjQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGP I LELBQW IDKYTSQLPPLT 
AF ILPSGGKI SSALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDIiYGQRSS APEQEIiLVQDATPVSNS LLPEKAFSDI P 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCVVDTIKRSSQTGEWQNIAIMTEEPELSPAYLISEAMRRSRMS 
LYC 


6910 


-L 


1068 



LVPVWI DS YYYGKLVI AP LN I VL YN I FTPHG PDLYGTE PW YFY 
LINGFLNFNVAFALALLVLPLTSLMEYLLQRFHVQNIjGHPYWLT 
iiapmyi wf 1 1 ffiqphkeerflfpvypliclcgavalsalqhsf 
LYFQKCYHFWQRYRLEHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFLLPDNWQIjQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
NLEE PS RY I D I S KCH YLVDLDTMRET PREP KY S SNKE E W I S LAY 

RPFLDASRSSKLLRAFYVPFLSDQYTVYVNYTILKPRKAKQIRK 
KSGG 


6911 


JL J. O 4 




UifiDAE JiMETGNVANIi I S I FGSSFSGLLRKSPGGGREEEEGEESG 
PEAAE PGQ I CCDKP VLRDMNPWS TA I VAF 


6912 



1 


844 



AMKP VETHS FQMLFT I LS TGSALKAQS YEDAYRC I KSS I LLGS I 
SGGTD 1 1 S CFMGHNFS L P VYKGE I QARNLGMAVEAWNEEGKAVW 
GESGELVCTKPIPCQPTHFWNDENGNKYRKAYFSKFPGIWAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVESFEEVE 
DSLCVPQYNKYREER V I LFLKMASGHAFQPDLVKR IRDAIRMGL 
SARHVPSL I LETKGI P YTLNGKKVEVAVKQI IAGKAVEQGGAFS 
NPETIiDLYRDIPELiQGF 


6913 


T £ A. 
J. O ~ J 


ices 


KKSHEE S HKE ELS YGAQAS LPLPCS D FR 


6914 


1251 



615 


ELAAECKSAGYPGTLI PYRCDLSNEEDILSMFSAI RSQHSGVDI 
CINNAGLT^PDTLLSGSTSGWKIDMFNVNVIiALS ICTREAYQSMK 
CKM VLJULrJll IN XNbM&OHRViiPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHI RATCI S PG WETQFAFKLHDKDPEKAAAT YEQMK 
CLKPEDVAEAVIYVLSTPAHIQIGDIQMRPTEQVT 


6915 


254 




uKoiior iS.lr J_il w VLilol lyoolijMlGAJjVljr c»bEFVnVVAISrT 
ALI LTELLMVALTVRTWHWLMWAEFLSLGCYVS S LAFLNEYFD 
VAFITTVTFLWKVSAITWS CLPLYVLKYLRRKLS PPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIMVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLWALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWS CLPLYVLKYLRRKLS PPSYCKLAS 


6917 


254 


652 


GRSLS FKTFLI WVLIS I YQGGI LMYGALVLFESEFVHWAIS FT 
ALILTELLMVALTVRTWHWLMUVaFPT^T^PYVQ <? T,a FT ,NP VPH 

VAF ITTVTFLWKVSAIT WS CLPLYVLKYLRRKLS P PS YCKLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSFPQWLPGGGGGQVSSCS 
DTDVP YLLLAVKS EPGRFAERQAVRETWGS PAPGIRLLFLLGS P 
VGEAGPDLDS LVAWESRRYS DLLLWDFLDVPFNQTLKDLLLLAW 
LGRH C PTVS F VLRAQDDAFVHT PALLAHLRAL P PAS ARS L YLGE 
VFTQAMPLRKPGG P FYVP ES F FEGG Y PAYASGGG YV I AGRLAP W 
LLRAAARVAP FP FEDVYTGLC I RALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS I RLWKQLQ0PRLQC 


6919 


850 


41 


QGRRELSGSVFCPFIQQEPKEMI1TL1SEYHERVRSQGQQI1QQLQA 
ELDKLHKEVSTVRAANSERVAKLVFQRLNEDFVRKPDYALSSVG 
AS IDLQKTSHD YADRNTAYFWNRFS FWNYARPPTVILEPHVFPG 
NCWAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 
RDFAVFFLIjS FFTHQGLiQVYDETEVSLGKFTFDVEKSE I QTFHL 
QNDP PAAFPKVKIQI LSNWGHPRFTCLYRVRAHGVRTSEGAEGS 
AQGPH 


6920 


1418 


591 


EAQG PS KVHLTLKKKK 


6921 


2 



1711 


MNATRSEEQFHVINHAEQTLRKMENYLKEKQLCDVLLIAGHLRI 
PAHRLVLSAVSDYFAAM FTNDVLEAKQEEVRMEGVDPNALNSLV 
QYAYTGVLQLKEDTIESLLAAACIiLQLTQVIDVCSNFLIKQLHP 
SNCLGIRSFGDAQGCTELLNVAHKYrMEHFIEVIKNQEFLLLPA 
ne i skllcsdd i nvpdeeti fhalmqwvghdvqnrqgelgmlls 
YIRLPL^PPQLLADIjETSSMFTGDLECQKLLMEAMKYHLLPERR 
SMMQS PRTKPRKSTVGALYAVGGMDAMKGTTT I EKYDLRTNSWL 
HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 
I WTVMPPMSTHRHGLGVATLEG PMYAVGGHDGWS YLNTVERWDP 
EGRQWNYVASMSTPRSTVGWAIiNNKLYAIGGRDGSSCLKSMEY 
r DPH I NKWS JjLAPMS KRRGGVGVATYNGFLY WGGHDAPASNHC 
SRLSDCVERYDPKGDSWSTVAPLSVPRDAVAVCPLGDKLYVVGG/ 
YIXjHTYIjNTVESYDAQRNEWKEEVPVNIGRAGACVVVVKLP 


6922 


1075 


369 


LTPPAGIRHEVRDREREREEiEREREKFPLDSTGSELKQNIHSIT 
GLPPAMQKA^YKGLAPEDKTLREIKVTSGAKIMGGGSTINDVLA 
VNTPKDAAQQDAKAEENKKEPLCROKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGS I KNWSE P I EGHED YHMMAFQLGPTEAS Y YWVYWVPTQY 
VDAIKDTVLGKWQYF 


6923 


2469 


1660 


LGLFCILPIDTLCAVLERDTLSIRESRDFGAVVRWAEAECQRQQ 
LPVTFGNKpKVIiGKALSLIRFPLMTIEEFAAGPAQSGILSDREV 
VNLFLHFTVNPi^RVEYIDRPRCCIiRGKECCINRFQQVESRWGY 
SGTSDRIRFTVNRRIS I VGFGLYGS IHGPTDYQVNIQI IEYEKK 
QTLGQNDTGFSCEX3TANTFRVMFKEPIEILPNVCYTACATLKGP 
DSHYGTKGLKKVVHETPAASKTVFFFFSSPGNOTGTSIEDGQIP 


6924 


2210 


1235 


PEERVI CFVB YYLTAFHEGRKGALAKKP YNPI IGETFHCS WEVP 
KDRVKPKRTASRSPASCHEHPMADDPSKSYKIiRFVAEQVSHHPP 
I S C FYCE CEE KRLC VNTHVWTKS KFMGMS VGVSM I GEGVLRLLE 
HGEEYVFTLPS AYARS I LT I PWVBLGGKVS INCAKTGYSAT VI F 
HTKPFYGGKVHRVTAEV1CHNPTNTIVCKAHGEWNGTLEFTYNNG 
ETKVIDTTTLPVYPKKIRPIiEKQGPMESRNLWREVTRYLRLGDI 
DAATEQKRHLEEKQRVE ERKRENLRTP WKPKYF I QEGDGSG I LQ 
SPLESTLMGLEVQSFPV 


6925 


2 


1653 


RGGAAGAAMEPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVEHLISRMCHYQHGHINSYLKPMLQRDFITALP 
EQGLDHIAENILSYIJ^SL(^^LVCKEWQRVISEGMLWKKLI 
CiKJrl V K 1 U V Jj W aLjJjS ERRG WDQ YLr KKRPTDGPPNSFYRSLYPKI 
IQD I ETIESNWRCGRHNLOR IOCRS ENS KGVYCLOYDDEKI I SG 
LRDNS I KI WDKTS LECLKVLTGHTGS VLCLQYDERVI VTGS SDS 
WRWDVNTCE Vl^TLIHHNEAVLHIiRFSNGLMOT KDRS IAV 
WDMAS ATDITLRRVLVGHRAAVNWDFDDKYI VSASGDRTI KVW 
STSTCEFVRTLNGHKRGIACLQYRDRLWSGSSDNTIRLWDIEC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKI KVWDLQAALDP 
RAPAS TLCLRTLVEHSGRVFRLQFDEFQ 1 1 SSSHDDTI L IWDFL 
NVPP S AQNETRS PS RTYTY I SR 


6926 


1 


733 


SGRVAMDGLGLQFPEQGFPAGPPLiLPPHMGGHYRDCQSLGAPPL 








DGYPLPTPDTSPIiDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPS I PGLLAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAISSWSDASSAVYYCNYPDV 


6927 


2 


1484 


LTLCGDIQLMLAQNANNRAAHLEEFHYQTKEDQEILHSLHRESS 
CQGFAWATDLSTDLESQLSVSCKCYEAANEILQFRDLKSQNPEH 
YVQVLKRMGNIRNE IGVFYMNQAAALQSERLVSKS VSAAEQQLW 
tvxvor our dr^saxriric CD ±JilJMlI\>i/LijijLA-iN aOJK±jMK1(JAUAH(_GA 
GDELKRE FS P E EGL Y YNKAI D YYLKALRS LGTRD I HPAVWDS VN 
WELSTTYFTMATLQQDYAPLSRKAQEQIEKEVSEAMMKSLKYCD 

VLADLHYSKAAKLFQLLKDAPCELLRVQLERVAFAEFQMTSQNS 

NVGKLKTLSGALDIMVRTEHAFQIiIQKELIEEFGQPKSGDAAAA 

ADASPSLNREEVMKLLSIFESRLSFLIjIjQSIKLLSSTKKKTSNN 

IEDDTILKTNKHIYSQLLRATANKTATLLERINVIVHLLGQLAA 
GSAASSNAVQ 


6928 


1086 


777 


EAIDLINNLLQVKMRKRYSVDKTLSHPWLQDYQTWLDLRELECK 
IGERYITHESDDLRWEKYAGEQGLQYPTHLINPSASHSDTPETE 
ETEMKALGERVSIL 


6929 


1749 


607 


RnORfiVPTTTlP QQ&DR Dni^\rC 7\ DTT3 crr/IWno ATTn^nn tit tt>Vt/-» 

c\u \jis. vj x K_uiJK.o i'Aicii JJvoJ4K.lKooo\jo<aKibAl 1 AMPPPVPNG 
NLHQHDPQDLRHNGNWVAGRPSCSRGPRRAIQKPQPAGGRRSG 
RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNS TSAQKNERES I RQKLALGS FFDDGPG I YTS C 

TPLSPMSKQSSSYSDRDTTEEESESLDDMDFLTRQKKLQAEAKM 
ALAMAKPMAKMQVEVEKQNRKKSPVADIjLPHMPHI seclmkrsl 
KPTDLRDMTIGQIjQVIVNDLHSQIESLNEELVQLLLIRDELHTE 

qdamlvdiedltrhaesqqkhmaekmpak - 


6930 


131 


545 


FKDTAWVFV^TiFnMPNWFPWVFTF'DCr*T..VT wm7T'nii7TW«?rktr> t " " 

SYTWPFVLLSIKPSLTFYSSWYYCUIILGILVLLLLPVKKTQR 
RKNTHENIQLS QS KKFDEGENSLGQNS FS TTNNVCNQNQE I ASR 
HSSLKQ 


6931 


2 


659 


FVERLPNRPACLLVASGAAEGVSAQS FLHCFTMASTAFNLQVAT 
PGGKAME FVDVTE SNARWVQD FRLKAYAS PAKLES I DGARYHAL 
L I PSCPGALTDLAS SGSLAR I LQHFHSES KPI CAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFSAS EPDAVHVVLDRHLVTGQMASS TVPAVQNLLFLCGSRK 


6932 


2 


1131 


FVDSPGOGEOAFFFFflfiTnMWQPMDaTJCDiiFnac^ " 
SDMCEGCRSLAAGHPGYISHDKETSI KYVSHQHPSHPQLFS I VR 
QACVRSLSCEVCPGREGPIFFGDEQHGFVFSHTFFIKDSLARGF 
QR WYS 1 1 TIMMDRI YLINS WPFLLGKVRG I IDELQGKALKVFEA 

EQFGCPQRAQRMNTAFTPFLHQRNGNAARSIjrSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTEDTLVOI^KTiAni.FFFSFc: 

WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHEI^RRANHGLCLPTRLASGPS TL 
KTIiQEVTDSLLGGWLMAQGVGGI I 


6933 


1431 


890 


SIjNLiHCTLPPPPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIREDCQNQKLW 
DEVLSHLVEGPNFLKKLEQSFMCVCCQELVYQPVTTECFHNVCK 
DCLQRS FKAQVFSCPACRHDLGQNYIMI PNEILQTLLDLFFPGY 
SKGR 


6934 


3030 


2588 


DRDHSQCGG IRRVALARVSSVKL I SKAKI RTVKMTFI I VLAF I V 
CWTPFFFVQMWSVWDANAPKBASAFI IVMLLASIiNSCCNPWIYM 
LFTGHLFHELVQRFLCCSAS YLKGRRLGETSAS KKSNSS S FVLS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSALWAGGNDGTSCLNSVERYSPBCAGAWESVAPMNIRRSTHDIj 
VAMDGWLYAVGGNDGS SSLNS I EKYNPRTNKWVAASCMFTRRS S 
VGVAVLELLNFPPPSSPTLSVSSTSIi 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSIiNVGLHYSHIPFLT 
TCLHFLRKRI^KGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRALLQLRGLDPS 
LPSPLPNLGPQGPALTPEQENIIiHTTQTDClfNNLAACLLQMEPV 
NYERWEYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYLL 
AAVNRQ P KDANVRR YLQLTQS ELS S YHRKE KQLYLGM FG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPARPCFVGEWS PWSGCADQCKPTTRVRRRSVQQEPQNGGA 
P CPPLEERAGCLEYS TPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 



NSRKLELAERVDTDFMQLKKRRQSSEKENDSGTLDTVGAVWDH 
EGNVAAAVS SGGLALKHPGRVGQAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTI LARECSHALQAEDAHQALLETMQNKFI S S 
PFLASEDGVLGGVIVLRSCRCSAEPDSSQNKQTLLVEFLWSHTT 
ESMCVGYMSAQIXSKAKTHISRLPPGAVAGQSVAIEGGVCRIjGEP 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 

G YE S LRRDS EATGS AS S APDSMS E SGAAS PGARXRS LKS P KKRA^ 

TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 

E I KVYE I DDVERLQR PRPTPREAPTQGLACVSTRLRI*AERRQQR 

LREVQAKHKHLCEELAETQGRLMLEPGRWLEQFEVDPELEPESA 

EYLAMjERATAAI^QCWLOCAHVMMVTCFDISVAASAAIPGPQ 

EVDV 


6940 


1188 

1 



496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 1 
VVKGSSPI^PAGLGAEEPAAGPQLPSWLQPERCAVFQCAQCHAV 
LAD S VHLAWD LS RS LG A WFS R VTNNWL E AP FL VG I EGSLKGS 
T YNLLFCGS CG I PVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELKEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI IGSNVLALAEAQRQAEALGYQA 
WLSAAMQGDVKSf4AQFYGLLAHVARTRLTPSMAGASVEEDAQL 
HELAAELQ I PDLQLEE ALETMAWGRGPVCLLAGGEPTVQLQGSG 
RGGRNQE LALRVG AE LRR WP LGP IDVLFLSGGTDGQDGPTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRLYADGGYDG 
QTYLNTMESYDPQTNEWTQMASLNIGRAGACVWIKQP 


6943 


1 


739 


PMATGDGAKTLA I HVKALTADS IRITW KATLP AS S FRLSWLRIiG 
HS PAGGS ITETLVQGDKTE YLLTALEPKPTY 1 1 CMVTMETTNAY 
VADETP VCAKAETADS YGPTTTLNQEQNAGPMASLPLAGI IGGA 
VTUiVFLFLVLGAICWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNS ILE IRGPGLQMLPINPYRAKEEYWHTI FPSNGSSLCK 
ATHTIGYGTTRGYRDGGIPDIDYSYT 


6944 


960 


156 


VANILLNGVKYESELTGSSERAEQPLSVGRLCSTICNMPKALRT 
LCVKHFLGWLSFEGMIiLFYTDFMGEWFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVDISLLSCQYFLAQILVSLVLGPLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEBHRPLL 
LNV 


6945 


2067 


179 


EGEDRGLPRTMQAALGTGTRIiAPWPGRACGALPRWTPTAPAQGC 

HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLEERLQLGIHGL 

IPPCFLSQDVQLLRIMRYYERQQSDLDKYlIIiMTLQDRNEKLFY 

RVLTS DVEKFMP I VYT PTVGLACQHYGLTFRRPRGLFITIHDKG 

HLATMIjNSWPEDNIKAVWTDGERILGLGDLGCYGMGIPVGKLA 

LYTACGGVNPQQCLPVLLDVGTNWEELLRDPLYIGLKHQRVHGK 

AYDDLLDEFMQAVTDKFGINCLIQFEDFANAN^ 

CM FNDD I QGTAS VAVAG I LAALR I TKNKLSNHVFG FQGAGEAAM 

G \ IAHLLVMALE\ KEGVPKA\EATRKI W\MVDF \KGL I VQGRDH 

LNHEKEMFAQD\HPEVNSLEEWRIjVKPTAIIGVAAIAEA\FTE 

Q I LRDMAS FHERP\ 1 1 FALSNPTS KAECTA\EKCYRVTEG PRGF 

FAS\GSPF*GVLIWEMGKTFIPGGRGNNA*RVPRGWQIjGVHSPG 

gdpghip\deiflpdsraklpqevseqhlsqgrlyp\plst\ir 
nvflriaikvfd*gykhnlv\syypepkd\keafckipgsytpd 

yds fyt/vds Y I WAQGKAMNVQTV 


6946 


133 


2551 


SCEYSGITVAPGDPCPGVAHLLAPSMASDTPESLMALCTDFCIjR 
NLDGTLGYLLDKETLRLHPDI FLPSE I \ CDRLVNE YVE LVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\I»VQD\QD\LE 
AIRKQDL\VEL\YLTN\CEKLSAKSX*QTLRSFSHTLGVP*AFFG 
C \TNILLLRKENPGGL/ CEDE YLFNPTCQVLVKDFTFEGFSRLR 
F\LKLGRMIDWVPVES\LLRPLNSLAALDLSGIQTSDAA\FLTQ 
WKDSL\VSLVL\YNMDLSDDHIRNVIVQIjHKIiRHLDISRDRLSS 
YYKFKLTREVLSLFVQKLGNIiMSLDISG\HMILENCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
I PAY KVS GDKNEEQ VLNAI EAYTEHRPEITSRAINLLFDIARIE 
RCNQLLRALKLVITALKCHKYDRNIQVTGSAALFYLTNSEYRSE 
QS VKLRRQVI Q WLNGMES YQEVTVQRNCCLTLCNFS IPEELEF 
QYRRVNELLLS ILNPTRQDES IQRIAVHLCNALVCQVDNDHKEA 
VGKMGFWTMLKLTQKKLLDKTCDQVMEFSW\ SAXjWNITDETPD 
NCEMFLNFNGMKLFLDCLNE FPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKAIX3IEVSYNACX3VLSHIMFDGPEA 
WGVCEPQREEVEERMWAAIQSWDINSRRNINYRSFEPILRLIiPQ 
GI S P VSQHWAT WAL YNLVS VYPDKYCPLLI KEGGM PIiLRD 1 1 KM 
ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947 


2 


1682 



TSVST I PRGLASARPQSRS WRCCPVWRRS PGRARGRGLKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVHRWVN YESMLKECLVGRMAI KPAVLK 
iJ 1 KiSEEKKVljNGMijPKSQVTDTLiAKEGPS YPS YDWFQTDS LVT I 
/EHIY*TEGYQFRLNNS*SSE*FLYSRNI^Y*GLLISYTYW/R*A 
MRFRKIFLCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 
IPRKDTGLYYRKCQIjISKEDVTHDTRLFCLMIjPPSTHLQVPIGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQELEDLFLLA 
AGTGFTPMVK I LN YALTD IPS LRKVKLMFFNKTEDD 1 1 WRSQLE 
xvUAr xUJ is-KiiLJ V a r V Jjo Air 1 b sl vfri\j J\y Ljxi 1 0 f Ai-tijo E r Jj KRNLtDK 
SPCVLVCI CGPVPFTEQGVRLLHDLNFSKNE IHS FTA 


6948 


104 


58 


PDGAHSFFPDEYFTCSSLCLSCGVGCKKSMNHGKEGVPHEAKSR 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGIiAKY 
AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVS ELSLGPTKAVTS WLTDQ I 
APAYWRPNSQILSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK 
TRP VPE RGWG PAP VRVCDNC YEAR/ TRP VS C YRGTSGR * RRRRT 
QETVE 


6949 


152 


4656 


GLRLCL S RPLTRPGDDS VGGSAMASGAGGVGGGGGG K I RTRRCH 
QGPIKPYQQGRQQHQGILSRVTESVKNIVPGWLQRYFNKNEDVC 








SCSTDTSEVPRWPENKEDHLVYADEESSNITDGRITPEPAVSNT 

EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGFSSRASDKDIT 

VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 

SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 

PVRRQMKAKQLSAQS YGVTS STARRI LQSLEKMSSPLADAKRI P 

SIVSSPLNSPLDRSGIDITDFQAKREKVDSQYPPVQRLMTPKPV 

S IATNRSVYFKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 

QALTNKVQMTS PSSTGS PMFKFSS P I VKSTEANVLPPS S IGFTF 

SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 

EGPFRPAE I LKEGSVLD I LKSPGFAS P KIDS VAAQPTATS P WY 

TRPAISSFSSSGIGFGESIiKAGSSWQCDTCLLQNKVTDNKCIAC 

QAAKLSPRDTAKQTGIBTPNKSGKTTLSASGTGFGDKFKPVIGT 

WDCDTCLVQNKPEAIKCVACETPKPGTCVKRALTL'nA/SESAET 

MTASSSSCTVTTGlliGFGDKFKRPIGSWECSVCCV'SNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 

SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 

MSEGF* FSKHI VGFKFGVSSESKPEEVKKDSKNDNFKFGLS FGL 

SNPWLTPFQFGVSNLGQEEKKEELLKSSCAGFRFGTGVINSTR 

VPANT IVTSENKSS FNLGTI ETKSVS VAPLKCQTSEAKKEEMPA 

TKGGFSFGNVE PAS LPS AS VFVLGRTEE KQQE PVTS TSLVFGEG 

KLTMKEPKC\ OPVPSFRKPORnTRTTRNQ d VQTWQ P^MTlfDCP VTP 
S EQPAKATFAFGAQTNTTADQGAAKPDLS YLNNSS S S SSTPATS 
AGGG\ IFGSSTSSSNPPVATFVFGQSSNPGSSS \AFGNTAESST 
SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGANQTPTFGQSQGASQP 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
SAFGSGTTPNSSSAFQFGSSTTNFNFTNNSPSGVFTFGANSSTP 
AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 


6950 


2585 



411 


prpgsrsglcrragergavragglsrrtrae * imdelhyqdtds 
dvpeqrds kckvkwtheedeqlralvrq fgqqdwkflashfpnr 
tdqqcqyrwlrvlnpdlvkgpwtkeedqkvielvkkygtkqwtl 
iakhlkgrlgkqcrerwhnhlnpevkkscwteeedri iceahkv 
lgnrwae iakmlpgrtdnavknh1vnsti krkvdtggfls eskdc 
kppvyllleledkdglqsaqptegqgslltnwpsvpptikeeen 
seeelaaattskeqepigtdldavrtpepleefpkredqegspp 
etslpykjfvveaanllipavgsslsealdtjiesdpdawcdlskf 
dlpeepsaedsinnslvoloashooovlppropsaVlvpsvthy 

rldghtisdlsrssrgelipispstevggsgigtppsvlkrqrk 

rrvalspvtenstsls FLDSCNSLTPKSTPVKTLPFS PSQFLNF 
WNKQDTLELESPSLTSTPVCSQKVVVTTPLHRDKTPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEBDLKE 
VLRSEAGIEIiIIEDDIRPEKQKRKPGIiRRSPIKKVRKSLALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF 
LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE 
KARQLLGRLKPSHTSRTLI LS 


6951 


1940 


239 


AGPDDTMKRSLQALYCQLIiSFLLILAliTEALAFAIQEPSPRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 
TS S KPEGRPRGQAAPT I LLTKPPGATS RPTTAP PRTTTRRPPRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKIFQIYKGMFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 








TVPSNTSWAPTTTSLGPAKDKPGIiRRAAQGGGSTFTSQGGTPDA 

TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 
SRPLSTSSGVFTAATGPTP2V2iFnTQ\7Q2iDQrV3TDnr'aCTTDr»aD 

THPSRVSESTISGAKEErVA\PSP*PTGCPVLSPQWYPQPQAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 
WPSACPS PP \LCPADGVLJHEEEEEDRQPGEQ PEAYGNNTHHPGT 
TFQQAC\RGAAPGE IPVPLKPLRTQLSEPRS PANGDYRDTGMVP 
C 


6952 


658 


304 


PESEGESGEMTDRYTIHSQLEHLQSKYIGT\ATPTPPSGSG\CE 
PTPRIjVLLLHGPLRPSQLIiRHCGE*EQSASPIiQLDGKDASALWT 
ASRQARGELRLCLTTAVRGTSPSVS PVCQSS 


6953 


1512 


349 


FSPHFRGKMGGW\KLEKELENTEQPVGGNEG*EHEVTGNLNSD 

PXiLEIiCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 

GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 

EVIiNMESLPTVHNEGPSSAEGKDIAFSPPWPAGILLVCNNCAA 

YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 

ETPEEREVRRMPJ3REAKRLQRMQETDEQRARRLQRDREAMRLKR 
AIETPEKROARLIRERKAIC RT .TfPP T .P vmtimmt .P anw2nno a bms 

ALAAEMNFFQLPVSGVELDSQLLGKMAFEEQNSSSLH 


6954 


819 


1 


PPPPFIIPSHPREAGT*AG*KRSGDSECSPPVEQ*A*TRAAAQN 
*PQR*RWTEGNS PQASAVATPGQGASPAAPRCTP * PSRRHRRLP 
PGARPPAG* AAPAPTKPWLAGPASAPQPGAAPLS PPAP PL IRTR 
* CAGAAARGRPRRDRS PRPRTPGGCS WSEPRTPPAVSASAQTPS 
DAG* AGGR*GQRQRPSTGR+ PPGVGGAGRSHRREGTI PGNPHPR 

AS*RAGWQR*PGP/REWGL*EPQGEEMSGPGGPGGAPPNQVGSS 
VMQAMSTGI 


6955 


1968 


782 


ppgrrqvraqvagapvghwgtrarqvktggrrrarr™pflgqd 

WP.G Pf^WQWT TfT'PTViWVT? f^T? CTCr>YT on tptvtktlt/ - *tvt t nun t tt vrr»T-ir\ 
rt t^o r vj w o « j. j\ i ri w &Kv* a o Lb LJ-tvli r. K liiNWrlV-Ni&HbllLiKSED 

GE IFNNEEHE YASKKRKKDHFRNDTNTQS FYREKW I YVHKESTK 

ERHGYCTLGEAFNRIjDFSSAIQDIRRFNYWKLLQLIAKSQLTS 

IjSGVAQKNYFNI ldkivqkvlddhhnprl ikdllqdlsstlci l 
/n*rsrevcisgkhqyldlpirnysrlattatgssdd*ase\ng 
ltlsdlplhmlnnxlyrfsdgwdiitlgqvtptlymlsedrqlw 
kklcqyhfaekqfcrhlilsekghiewklmyfalqkhypakeqy 
gdtlhfcrhcsilfwkdsghpctaadpdscftpvspqhfidlfk 

F 


6956 


8605 


3839 


QTSTS I FAS PTS PPVLGES VLQDNSFDLNNGSDAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 
TS PKAS PVTSPAAAFPTAS PANKDVSS FLETTADVEE I TGEGLT 
ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 
YYGPCGKRMKQFPEVIKYLSRNVVHSVRREHFSFSPRMPVGDFF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
PKVKRGRGRPPKVKITELLNKTDWRPLKKLEAQETLNEEDKAKI 
AKS KKKMRQKVQRGE CQTT IQGQARNKRKQETKSLKQKEAKKKS 
KAEKEKGKTKQEKLIOBKVKREKKEKVKMKEKEEVTKAKPACKAD 
KTLATQRRLEERQRQQMILEEMKKPrEDMCLTDHQPLPDFSRVP 
GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 
LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLKILGEKVSEI 
PLTRDNVSEILRCFLMAYGVEPALCDRLRTQPFQAQPPQQKAAV 
LAFLVHELNGSTLI INE IDKTLESMS S YRKNKWI VEGRLRRLKT 
VIAKRTGRSEVEMEGPEECLGRRRS SRIMEVTSGMEEEEEEES I 
AAVPGRRGRRDGEVDATASS I PELERQ I EKLS KRQLFFRKKLLH 
SSQMLRAVSLGQDRYRRRYWVLPYLAGIFVEGTEGNLVPEEVIK 
KETDSLKVAAHASLNPALFSMKMELAGSNTTASS PARARGRPRK 




TKPGSMQPRHLKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQLQ 
LQLQSHKGFIiEQEGSPLSLGQSQHDLSQSAFLSWLSQTQSHSSL 
LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPT PP PAVSEDQPTPS PQQLASS KPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQ PVPPEMCSGWWWI RDPEMLDAMLKALHPR 
GIREKALHKHLNKHRDFLQEVCLRPSADPIFEPRQLPAFQEGIM 
SWSPKEKTYETDLAVIiQWVEELEQRVlMSDLQIRGWTCPSPDST 
REDLAYCEHLSDSQED I T WRGRGREGLAPQRKTTOPLDLAVMRL 
AALEQNVERRYLRE PLWPTHE VVLE KALLSTPNGAPEGTTTE I S 
YEITPRIRWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNKVT 
CLVCRKGDNDE FLLLCDGCDRGCHI YCHRPKMEAVPEGDWFCTV 
CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVIjLR 
GRE S PAAGPRYS EEGLS PSKRRRLS MRNHHSDLTFCE 1 1 LMEME 
SHDAAWPFLEPVNPRLVSGYRR 1 1 KNPMDFSTMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 

YQGKQGQSVRQGRWGVTLWHLPPTFQTKTCHFHLLMLPWVQTQV 
RYNPDF 


6957 


82 



3514 



HLI VAMPE PTKKEENE VPAPAP P PEE PSKEKEAGTTPAKDWTLV 

ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 

EDLSEKPTINGSRKWMDLASKAGKHLQLKETFERHSRVYTFEMQ 

1 1 KAKDNFAGNYRCEVTYKDKFDS CS FDLEVHESTGTTPN I DIR 

SAFKRSGEGQEDAGEIiDFSGIiLKRRJBVKQQEEEPQVDVWELLKN*" 

TKPSEYEKIAFQYESPTCSGMLKRLKRSIREEKKSAAFAKILDP 

VYQVDKGGRVRFWELADPKIjEVKWNKNGQELRPSTKYIFEDTR 

CQS I LNIDNCQMTDDSE YYVTAGDEKCSTELLVREPPIMVTKQL 

EDTTDYCGERVE LECEVS EDDAQVKWFKNGEE I 1 LVQTRYR I RV 

EGKKHILIIEGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 

PLTDQTVNLGKE I CLKCE I S ENI PGKWTKNGLPVQESDRLKWH 

KGRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHV1DPPKI I 

LDGLDMNTVTVI AGNKLRLiE I P I SGE PPPKAMWS RGDKAI MEG 

SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 

VKVVDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 

RKKKQSSRWMRLNFDLCKETTFEPKKM IEGVAYEVRIFAVNA\ I 

GISKPSMPSRPFVPLAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 

GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDLPAEDWIVANKD 

LIDKTKFTTTGLPTDAKI FVRVKAVNAAGASEPKYYSQPILVKE 

IIEPPKIHSPKHLKQTYIRRVGDRVILVIPFQGKPRPELTWKKD 

GAEIDKNQINIRNSETDTIIFIEiKAERSHSGKYDLQVKVDKFVE 

TASIDIRIIDRPGPPQIVKIEDVWGRNVALTWTPPKDDGNAAIT 

GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 

MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 

VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 

NQGV CTLE I RKPS P YDGGT YCCKAVNDLGTVE I ECKLEVKVIAQ 


6958 


274 


1663 


PRTSRVKTEGSQGSSAMDFSVKVDIEKEVTCPICLELLTEPIiSL 
DCGHS FCQACITAKI KES VI I SRGES S CPVCQTRFQPGNLRPNR 
HLANI VERVKEVKMS PQEGQKRDVCEHHGKKLQIFCKEDGKVI C 
ft v L^dOjij y Cinyutiy IrKJLWlSVV Jvb Lva xsJjy V ALiQ KJj 1 KbNQKAE K 
LEDDI RQERTAWKNY1Q IBRQKI LKGFNEMRVI LDNEEQRBLQK 
LEEGEVNVLDNLAAATDQLVQQRQDAS TL ISDLQRRLRGSSVEM 
LQDV1DVMKRSESWTLKKPKSVSKKLKSVFRVPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLNK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\t>PWLGFS 


6959 


1 


1469 


SLVHWEFGRGIEDFPYLFFQLTHCQQRICSVTQAGVQWCDHSS 








LQPQTPGLNQSSHLSLLSSRDYRMLS S FNEWFWQDRFWLPPNVT 
WTELEDRDGRVYPHPQDLLAALPLiALVLLAMRLAFERFIGLPLS 
RWLjGVRDQTRRQVJ^NAIX»EKH 

TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGLSV 
LYHESWLWAPVMCWDRYPNQLTLS CPAADSEA\SLYWWYIiLBLG 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFS YSANLLRIGSLVLLLHDS SDYLLEACKMVNYMQ 
YQQVCDALFLI FSFVFFYTRLVLF PTQ I LYTTYYES I SNRGPFF 
GYYFFNGIiLMLLQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWAREKEMQEF\TRSFF\RGRPDLSTLTHSIVRRRYLAHSGRS 

HLEPEEKQALKRLVEEEPliKMQVDEAASREDKLDLTKKGKRPPT 

PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTHP 

KEENPRRA\SKAVEESSDEERQRDLPAQRGEESSEEEEKGYKGK 

TRlQxPVVKKQAPGKASVSRKQAREESE 

KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 

PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 

DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 

KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 

SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 

ACGAHRNYKKLIiGSCCSHKERLS I LRAELEALGMKGTPSLGKCR 

ALKEQREEAAEVASLDVANI ISGSGRPRRRTAWNPLGEAAPPGE 

LYRRTLDSDEERPRPAPPDWSHMRGI ISSDGESN 


6961 


340 


1646 


RPWSSPTMKPNFSLRLRI FNLNCWGI PYLSKHRADRMRRLGDFL 
NQ E S FDLALLE E VWS E QDFQ YLRQKLS PTY P AAHHFRSG 1 1 GSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLIiVliHL 
SGMVL^YVTHLHAEYNRQKDIYIiAHRVAQAWELAQFIHHTS^ 
ADVVLLCGDLi^HPEDLGCCIiLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
LALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


RPWS S PTMKPNPSLRLR I FNLNCWG I P YLSKHRADRMRRLGDFL 
NQES FDLALLEE VWSEQDFQYLRQKLSPTYPAAHHFRSGI IGSG 
LCVFS KHP IQELTQHI YTLNGYP YM IHHGDWFSGKAVGLLVIiHL 
SGMVLNAYVTHLHAEYNRQKDI YLAHRVAQAWEIiAQF I HHTS KK 
ADVVLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKP FP FGVR IDYVLYKAVSGFYI SCKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
IALLCVIiAAGGGAGEAAILIjWTPSVGLVLWAGAFYIjFHVQEVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6963 



374 


2618 



RVTPLILKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF 
EAVIiSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTERIHSIN 
LHNFSNSVLETLNEQRNRGHFCDVTVRIHGSMLRAQRCVIiAAGS 
pffodklllgysdi e ips ws vosvo kl idfmysgvlrvsos ea 
LQILTAASILQIKTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMERYL 
STTPETTHCRKQPRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNESEECTEDTDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFGPGAARDSQAEPTQPEQAAEAPAEGGPQTNQIiETGASSPE 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






PFLFSLPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH 
STASGQGEKKP YECTLCNKT FTAKQNYVKHMFVHTGEKPHQCS I 
CWRSFSLKDYLIK\HMVTHTGVRAYQCSICZNKRFTQKSSIiNVHM 

GARAGPPGWACTEGTTYVCSVCPAKFDQIEQFNDHMRMHVSDG 


6964 


1 


178


SGRP FFFFFSNTDVYF I KKVTNRWTAGSSYKMTRMKS IGKI LLL 
QIFIG\NCSMFVLVI 


6965 


757 


208 


NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAALEVCSCGS 
SGSLGYNLPQNH\GLLGRWTLVLLGQMRRISPFLCLKDRSDFRF 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVIEGSTLALRRY 

T?if\T7 PTC? it*t T? 

FQoSXoTLjE 


6966


820 



1867 


IITALGVRGMPGCPCPGCGMAGPPIiLFLTAIiALELIjGRAGGSQP , 
ALRSRGTATACRIjDNKES ES WGALLSGERLDTWI CSLLGSIjMVG 
LSGVFPLLVIPLEMGTMLRSEAGAWRLKQLLSFAIiGGLLGNVFL 

/hvpgqqgggdqpgpqqrphcccrraqwrplsgpagcrarprcr 
gp\dikvsgylnllantidnfthglavaasflvskkigllttma, 
illheiphevgdfaillragfdrwsaaklqlstalggllgagfa 
ictqspkgveetaawvlpftsggflyialvnvlpdlleeedpw 


6967 


162 


633 


gflpfkywildlsassrmetdcnpmelssmsgfeegselngfeg 
tdmkdmrleaeawndvl favnnmfvs kslrcaddvay invetk ■ j; 
ernryclelteaglkwgyafdqvddhlqtpyhetvyslldtl\" : - 
spayreafgkr\llqrlealkrdgqs 


6968


1


2265 


rgggggrggpgarererpgepertmeaaaggrgcfqphpglqkt 
leqfhlssmsslggpaafsarwaqeaykkesakeagaaavpapv ; 
paatepppvlhlpaiqppppvlpgpffmpsdrstercetvlege 

TISCFVVGGBKRLCLPQILNSVLIIDFSLQQINAVCDEIjHIYCSR 

ctadqleilkvmgilpfsapscglitktdaerlcnallyggayp 
ppckkelaaslalglelsersvrvyhe\cfgkckgl\lvpelys : 
spsaaciqcld\crlmypphkpvvhshkalenrtchwgf\dsa^ 
nwrayillsqdytgkeeqarlgr\clddvkekfdygnkykrrvp [ 
rvsseppasirpktddtssqspapsekdkpsswlrtlagssnks . 

LG CVHPRQRLS AFRPWS PAVS AS EKELiS PHL PAL I RDS FYSY KS 
FE TAVAPNVALAP P AQQ KWS SP PGAAAVSRAP E PLATCTQP RK 
RKLTVDTPGAPETLAPVAA? EEDKDSEAEVEVE SREEFTS S LSS [ 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL j 
EHLRQALEGGLDTKEAKE KFLHEWKMRVKQEEKLSAALQAKRS 
LHQELEFLRVAKKEKLREATEAKRNLRKEIERX<RAEISrEKKMKE^ 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QH7^EADREQLRADLLREREAREHLEK\WK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKLKLYQSATQAVFQKRQA 
GELDESVLELTSQILGANPDFATLWNCRREVLQQliETQKSPEEL 
AALVKAELGFLESCLRVNPKSYGTWHHRCWLLGRLPEPNWTREL 
ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEEIiAFTDSLITR 
NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDOS AW F YHR W L LGRAD PODALRCLHVSRDEACLTVS F 
SRPLLVGSRME I LLLMVDDS PLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 
QLFRCELSVEKSTVLQSELESCKELQELEPENKWCL\DTIILLM 
RALDPLLYEKETLQYFQTLK\AWDPKRATY\LDDLRSKFLliENS 
VLKMEYAEVRVLHLAHKDLTVI/:HLEQIiLLVTHLDLSHNRLRTL 
PP7VLAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRLQELLL 
CIWRLQQPAVLQPLASCPRLVTjLNLQGNPLCQAVGI LEQLAELL 
PSVSSVLT 


6970 


3 


1528 


SFPPLLSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQLBPLNE 
GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEDEVEILGPFPA 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 
LRRTYNPDDYFRKFEPHLYSLDSWSDDVDSLTDEEILSKYQIiGM 
LHFSTQYDLLHKHLTVRVIEARDLPPPISHDGSRQDMAHSNPYV 
KI CLLPDQKNS KQTGVKRKTQKPVFEERYTFE I PFLEAQRRTLIi 
LTVVDFDKFSRHCVIGKVSVPLCEVDtiVKGGHWWKALIPSSQNE 
VELGELLLSLNYLPSAGRLNVD VI RAKQLLQTDVSQGSDP FVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHNMKSSNDFIGRIVTG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVS PASLEVT 


6971 


37 



3702


ACFYVPGSRS FKLI PRHGLVNMGRSGKLPSGVSAKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLSDCTNVTFSKVQ 
RFWESNSAAHKE I CAVLAAVTEVI RS QGGKETETE YFAAL I RKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTLHMLTLLKDLLPCFPEGLVKSCSETLLRVMTLSHVLVTA 
CAMQAFHSLFHARPGLSTLSJ^ELNAQIITALYDYVPSENDLQPL 
IAWLKVMEKAHINLVRLQWDIXLGHLPRFFGTAWCLLSPHSQV 
LTAATQSLKEILKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCLQSLCDL 
RLS PHFPHTAALDQAVGAAVTSMGPEVVLQAVPLE I DGSEETLD 
FPRSWLLPVIRDHVQETRLGFFTTYFIiPLANTLKSKAMDLAQAG 
STVESKI YDTLQWQMWTLLPGFCTRPTDVAI S FKGLARTLGMAI 
SERPDLRVTVCQALRTL ITKGGQAEADRAEVSRFAKNFLP I LFN 
LYGQ P VAAGDTP APRRA VLE TIRTYLT I TDTQLVNS LLE KAS EK 
VLDPASSDFTRLSVLDLVVAIAPCADEAAISKIjYSTIRPYLESK 
AHGVQKKAYRVLEEVCAS PQGPGALFVQSHLEDLKKTLLDSLRS 
TSSPAKRPRLKCLLHIVRKLSAEHKEFITALIPEVILCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYIjVLIYPGIiVGAVT 
MVSCS I LALTHLL FEFKGLMGTSTVE QLLENVCLLIiASRTRD W 
KSALGFIKVAVTVMDVAHLAKHVQLVNEAIGKLSDDMRRHFRMK 
LRNLFT\ KFI PK\ FGILTWGKKAVGP KEYHRVLVNI RKAEARAK 
RHRALSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPGPGRGRKKDHS FKVSADGRLi IREEADGNKMEEEEGAKGED 
EEMADPMEDVIIRNKKHQKLKHQKEAEEEELEIPPQYQAGGSGI 
HRPVAKKAMPGAE YKAKKAKGDVKKKGRPDP YAY I PLNRS KLNR 
RKKMKLQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAI LLPLWRRTRPREATVPRGAAQRGRARSAEGRI PSSQS PS 
PAEAGGATRSPP PRPPRPARP PGPSAP PLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
AITLESPDIKYPLRLIDREIISHDTRRFRFALPSPQHIIiGLPVG 
QHIYLSARIDGNLWRPYTPISSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDT1 E FRGPSGLLVYQGKGKFAIRPDKK 
SNPIIRTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF 

FVNEEMIRDHLPPPE\EEPLVLMCGPPPMIQYACLPNL\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGRPAPGVDAMVLCPVI GKLLHKR WLAS A 
S PRRQE ILSNAGLRFE WPS KFKEKLDKAS FATP YG YAMETAKQ 
KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSEFYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFEDL 
oU VCAa^ivjOEir' I vKJJAvaoKJUli JSAr^vjclAoUAi AilLAc»LInK 1 Kb 1 L»P 
PFPTRLLEIiIEGFMLSKGIjIiTACKIiKVFDLLKDEAPQKAADIAS 
KVDAS ACGMERLLD I CAAMGLLEKTEQGY SNTETANVYLASDGE 
YSLHGFI MHNNDLTWNLFTYLE FAIREGTNQHHRALGKKAEDLF 
QDAYYQS PETRLRFMRAMHGMTKLTACQVATAFNLSRFS SACDV 
GGCTGALARELAREYPRMQVTVFDLPDI IELAAHFQPPGPQAVQ 
IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 
i\.rijAVj.LiJj Jj vl*4 1 LiLtDbc, l\K VAy KAljMUSIjiWijVQTEGlUsRSLGEY 
QCLLELHGFHQVQVVHIiGGVLDAIIi\PPKWPPEAQAACSL 


6974 


3082


2172 


RSCAAFASFASRPPLELFAPPGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 

TSLLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH. 
LTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPQAPASSPSSL 
STS PPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEES S SDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


RPRPTVHCCKWALKLETAMETLINVFHAHSGKEGDKYKLSKKEIi 

SLPPAPQPPPYL*LSAVPFPIHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQEYVVLVAALTVACb3NFFWENS 


6976 



1215 


970 



GCQL* VAYGTTENSPVTFAHFPEDTVEQKAESVGR IMPHTEARI 

MNMEAGTLAKLOTPGELCIRGYCVMIXaYWGEPQKTEEAVD 

YWTGDVATMNEQGFCKIVGRSKDMIIRGGENIYPAELEDFFHTH 

I SHFKI PKY I VFVTNY PLTI SGKI QKFKLREQMERHt»NL * IKQQ 


6977 


1298 



588 


SLFINTNLIjSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R 
ANKKS KHHVNGNRTVE PFPEX3TQMAVPGMGCFWGAERKFWVLKG- 
VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEVVRVVYQPEHMSFE 
ELLCTFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS ; 
KENYQKVLS EHGFGP I TTDIREGQTF YYAEDYHQQYLS KNPNG Y . 
CGLGGTGVS CP VGIKK 


6978


3 


242 


SFP FRDSRRCGGCKGSSLRHTAVAMVKLSKEAKQRLQQLFKGSQ - 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG 


6979 


3917 


1146 


DEARVRGEAVAAAILSRCRHWSGPPPFPPSPPDRKGLRGTEPWE. 

AGPGSGAT PGARAMDVRRLKVNELREEIiQRRGLDTRGLKTELAE * 

RLQAAIiEAEEPDDERELDADDEPGRPGHINEEVETEGGSELEGT. 

AQPPPPGLQPHAEPGGYSGPDGHYAMDWITRQNQFYDTQVIKQE 

NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFLPPEASQLKP 

DRQQFQSRKRPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 

OTLVAIDTYNCDLHFKVARDRSSGYPLTIEGFAYLWSGARASYG 

VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 

GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECGN 

DVELS FTKNGKWMGIAFR IQKEA1XGJQALYPHVL VKWCAVEFNF 

GQRAEPYCSVLPGFTFIQHLPLSERIRGTVGPKSKAECEILMMV 

GLPAAGKTTWAI KHAASNPS KKYNI LGTNAIMDKMRVMGLRRQR 

RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 
EMKANFTLPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 
PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 
NQYQQYAQQWNQ YYQNQGQW P P YYGNYP YGS YSGNTQGGTSTQ 


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSLSDPGIjGYHPT 
CWTLRW P PLCS LHALHVFHCL FSS RLGTP VS PRLAMD PNCS CEA 
GG S CACAGS CKCKKCKCTS CKKSCCS CCPLGCAKCAQGCI C KGA 
SEKCSCCA 


6981 


10 


1054


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQS LRKLKQRFP SQSQATLLS I FSQE YQKHI KRTHAKHHTSEAI 
ESYYQRYIxNGWKNGAAPVLl^LAN^ 

EHEETPPSKSI INSMLRX1PSQI PDGVXxANQVYQCXVNDCCYGPL 
VDCIKHAIGHEHEVLLRDLLLEKNLSFIJDEDQLRAKGYDKTPDF 
ILQVPVAVEGHI IHWIESKASFGDECSHHAYLHDQFWSYWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982 



153 


1285


r * \L VUL o Afrt/vf tjJj/Wa c*ir KKijKA I KKRRQRARGLKRVAWLiAP P 

PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVIKAFLCGS I SGTCSTLLFQPLDLLKTRLQTLQ 
PSDHGSRRVGMLAVIiLKVVRTESLIjGLWKGMS PS IVRCVPGVG I 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRS VAGVCMS P ITVI 
KTRYESGKYGYES I YAALRS I YHSEGHRGLFS GLTATLLRDAP F 

QPADVIKTHMQLYPLKFQWIGQAVTLI FKDYGLRGFFQGGI PRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMSFLQDPS FFTMGMWS IGAGALGAAALALLLANTDVFLS KPQK 

AALEYl^D IDLKTLEKE PRTFKAKELWEKNGAVIMAVRRPGCFL 
CREEAADIiSS TifC ^MTiDnT JH7DT.V A Vinnru TDTtn rtrnt?ftnvt>irnn 

^"•^J-«rt*"»J^iJiJO jJXVjluJJI^yjJO V C JLllAV V P> n. H 1 K 1 Yt V K 1 1H yjf X E K.\jE 

IFIJDEKKKFYGPQRRKMMF^FIRLGVWYNFFRAWNGGFSGNLE 

GEGFILGGVFWGSGKQGrLLEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGRSAYSLPAGSLPRVPATAAAKMASGVQVADEVCRIFYDMKVR 

. DP FKHFVGMLPEKDCRYALYDAS FETKESRKEELMFFLWAPELA 
PLKSKMIYAS S KDAI KKKFQG I KHECQANGPEDLNRAC I AE KLG 
GSLIVAFEGCPV 


6985 


1887


1324 


RRTAGI YPCF PKPGRTRHALC^VVliLLLTGQLAFDDFQES CAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 

IPVSGSKSEGLLYVHSSRGGPL^RWHLDEVFLELKDGQQIPVFK . 
LSGENGDEVKKE 


6986


642


1350 


YHLYFKMGDPNSRKKQAIJJ^RAQLRKKKES 

WKEKKKKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 

ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 

LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 

VlWPNQSVFLFIDRQHl^TPKNKATIFKLCSICLYLPQEQLTHW 
AVGT I EDHLRP YMPE 


6987 


1623


341 


LEAAE KAS RAFKESQRQTDS KNYETENWS PQKSQRR YDMYNTAC 
FLGEIEVGLYTIOTLnT J T 1 PPPWKTWPT.QTrK'H'MT7nT?T cr»in»TPTTin 

DPRNECYLALSKFTSHLKl^SDLI^^ 

KEIAEIMLSKKVSRCFRKYTELFCHLDPCLLQSKESQLLQEENC 
RKKLEALRADRFAGLLE YLNPOTKDATTME S I VNE YAFLLQQNS 
KKPMTNE KQNS I LANI ILS CLKPNS KL IQ PLTTLKKQLRE VLQF 
VGI^HQYPGPYFl^CLLFWPENQEl^QDSKLIEKWSSLNRSFR 
GQYKRMCRS KQASTLFYLGKRKGLNS I VHKAKIEQY FDKAQNTN 
SLWHSGDVWiO<NEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNIERVSFYLGFSIEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPQPMPRADCIMRHLPYFCRGQWRG 
FGRGS KQLG I PTANF PEQ WDNLPAD I S TG I YYGWAS VGSGDVH 
KMWS I GWNP YY KNTKKSMETH I MHTFKEDF YGE I LNVA I VGYL 
RPEKNFDSLESLISAIQGDIEEAKKRLELPEHliKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


"in 8 


LM PS DRP LS PS THAS AGSHCHAP PTTAR RAF P T P VC1 Q If Q MM n TT . 

u it sua* ijo t j i imuAuunwufu r i i ^uxxvrvr trie f v?o xvOlAJl'Jx^X l_i 

KDQLIYNLLKEEQTPQNECITWGVGAVGMACAISILMKDLADEL 
ALVDVIEDKLKGEMMDLQHGSLFLRTPKIVSGKDYNVTANSKLV 
1 1 TAGARQQEGESRLNLVQRNVNI FKFI I PNWKYSPNCKLLI V 
SNPVDI LT YVAWK I SGF PKNRVIGSGCNLDS ARFRYTiMf4FT?T/?V 
HPLS CHG WVLGEHGDS S VPVWSGMNVAG VS L KTLHPDLGTDKD K 
EQWKEVHKQVVESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
RVHPVS TM I KGL YG I KDDVFLS VP C I LGONG I S DL VKVTIjTS EE 
EARLKKS ADTLWG I QKELQF 


6990 



719 


258 


THASGMAS VVLALRTRTAVTS LLS PTPATALAVRYASKKSGGS S 
KNLGGKS SGRRQG I KKMEGHYVHAGNI I ATQRHFRWHPGAHVGV 
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNKRRNPKKIAYLL 
SSLLMTNLNPNESTENQPVDAYV/AFTLDOEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCSS LALRQVRQVYCGLVRAPQVQTRPLS SRFVERRGALY 
RSPMNQENPPPYPGPGPTAPYPPYPPQPMGPGPMGGPYPPPQGY. 
PYQGYPQYGWQGGPQEPPKTTVYVVEDQRRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


OWCVTCPOHNAROGPAVPPGTOAYGAAPPEDI.OVDPTFM Q IfPRfJ 
DRVWIKNWOTASLCPLWKGPQTVVLSPPTAVKVEGI PAWIHHSH 
VKPAARETWEARPS PDNPFRVTLKKTTSPAPVTPGS 


6994 


346 



1100 



QWPE KDP VMAAS SISS P WGKHVFKAI LMVLVAL I UjHSALAQSR 
RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHIiVSES 
SSQVLWAI SSAISVAF FALSG IAAQLLNALGIiAGDYLAQGLKLtS 
PGQVQTFLLWGAGALWYWLLSLLLGLVIiALLGRI LWGLKLVI F 
LAGFVALMRSVPDPSTRALIjLLALIjILYALLSRLTGSRASGAQIj 
EAKVRGLERQVEELRWRQRRAAKGARSVEBE 


6995 



144 


1346 


GSVAVGLSGIMAAQKDLWDAIVIGAGIQGCFTAYHLAKHRKRIL 
Ti LEO PPXi PHS RGS QRGOfi R T T R Tf A VT • Pn PVTP MMPTPf^Vri T Man T» 

EHEAG1X3LHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 

EELKQRFPNI RLPRGEVGLDDNSGGVI YAYKALRAxjQDAIRQLG ' 

GIVRDGEKWEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQLL 

RPLGIEMPLQTLRIOTCYWREMVPGSYGVSQAFPCFLWLGLCPH 

HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 

SSFVRDHLPDLKPEPAVIESCMYTNTPDEQFILDRHPKYDNIVI 

GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 

KAHXi 


6996 


543 


1942 



ETANAEAAARKSAMDWKEVLRRRLATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKIiR 
EESRAVFLORKSRELLDNEELONLWFLLDKHOTPPMIGEEAMIN 
YENFLKVGEKAGAKCKQFFTAK\^AKLLHTDSYGRIS IMQFFNY 
VMJIKVWLHQTRIGLSLYDVAGQGYLRESDLENYI LELI PTLPQL 
DGLEKS FYS FYVCTAVRKFFFFLDPLRTGKI KI QD I LACS FLDD 
LLELRDEELSKESQETOWFSAPSALRVYGQYLNLDKDHNGMLSK 
EELSRYGTATMTm^LDRVFQECLTYDGEmYlOTLDFVIiALEK 
RKEPAAMYIFiai^IENKGYl^FSLira^FP^IQELMKIHGQD 
PVSFQDVKDEIFDMVKPKDPLKISLQDLINSNQGDTVTTILIDL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFILRLAIYILTFPliYLLNFLGLWSWICKKWFPYFLVRF 
TVIYl^QMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFVVAAGENMH 
QVADGS VD WVCTLVLCSVKNQER I LRE VCRVLRPGGAF Y FMEH 
VAAECSTWNYFWQQVLDPAWHLLFDGCNLTRESWKALERAS FSK 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616 


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHLPVPYAPPTMESRGKSASSPKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQPPAAPTTAPAKKTSAKADPALLNNHSN 
LKPAPTVPS SPDATPE PKGPGDGAEEDEAASGGPGGRGPWS CEN 
FNPLLVAGGVAVAAI AL I LGVAFLVRKK 


6999 


14 



1591 


GRAGACSRRDTAMS IE I ES SDVIRLIMQYLKENSLHRALATLQE 
ETTVSLNTVDS I ES FVADIWSGHWDTVLQAIQSLKIiPDKTL I DL 

YEQVVLELIELRELGAARSLLRQTDPMIMLKQTQPERYIHLENL 

t a'DCVTTTi'D'DTJii.VDTV^ccTrpv'Doaa Tar>7VT an woinrDD cdt mat 
urtKo I r UtrtxaAl ItLAjoo ft. is ft k KAti I AUAl iAl-iK vSV V i^i^oKJjr^LftJj 

LGQALKWQQHQGLLPPGMTIDLFRGKAAVKDVEEEKFPTQLSRH 
IKFGQKSHVECARFSPDGQYL\rrGSVDGFIEVWNFTTGKIRKDIj 
KYQAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKI KVWKIQSG 
QCLRRFERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKE FRGHS S FVNE AT FTQDGHY IISASS DGTVKI WNMKTTE CS 
NTFKSLGSTAGTDITVNSVILLPKNPEHFWCNRSNIWIMNMQ 

TGKLERTLTVHEKDVIG IAHHPHQNLIAT YSEDGLLKLWKP 


7000 


2


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLIiQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEED IVGRN 
LLYAACMAGQSDVI RALAKYGVNLNE KTTRGYTLLHCAAAWGRL 
ETLKALVELDVDI EALNFREERARDVAAR YSQTECVEFLDWADA 
RLTLKKY I AKVSLAVTDTEKGSGKLLKEDKNT ILSACRAKNEWL 
ETHTEASINELFEQRQQLEDIVTPIFTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 



2056 



844 


RRCLIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRLIAN 

TSPKRLCIRPSEPVDAVVWSVKHDPLPIiLPEANGHRSTNSPTI 
VSPAIVSPTQDSRPNMSRPLITRSPASPWQGIPTPAQLT-KSN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 

HYFIDRDGQMFRYILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
nPMT.T.FMFRWTfnnPTrTYrjBPcpDnppT.vvpvADnT m?PTTT cmw 

UxrPixjxjJLv'lJLKVV iSA^UrCCi x ux%r oKr V V it vi-iir JJXj\j£irCX X IjouJJlS. 

SLIEEWPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFEIVGSCX3GGVDSSQFSEYVLRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 



498 


PMPS STRWTTS * T YTDTSS AWACRFTTGTCT * TAAPG PTVR WWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGPRTGTATSGTATTTSWPGCGTRMWSTOWSSV 

PRSRSCCSRPATT P PSKPGAPHAPCAS SRHLAHGLAPS S PGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRLSALLAIaASKVTLPPHYRYGMSPP 
GS VADKRKNPPWI RRRPVWEP I SDEDWYLFCGDTVE I LEGKDA 
GKQGKVVQVIRQRI^WWGGLNTHYRYIGKT^YRGTMIPSEAP 
LLHRQVKLVDPMDRKPTEI E WRFTEAGERVRVSTRSGRI I PKPE 
FPRADGIVPETWIDGPBQOTSVEDAI^RTYVPCLKTLQEEIVMEAM 
GIKETR\NTRRSIGIEPGAEQLLPNFCPSLEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
GXPKRTLICTQIiG/YYCRVRPLGFPDQECCIEVIlWlTVQLHTPE 
GYRLNRNGDYKETQ YS FKQVFGTHTTQKELFDWANPLVNDL IH 
GKNGLLFTYGVTGSGKTHTMTGS PGEGGLLPRCLDMI FNS IGSF 
QAKRYVFKSNDRNSMD IQCE VDALLERQKREAMPNP KTSS S KRQ 
VDPEFADM I TVQE FCKAEE VDEDS VYG VFVS Y IE I YNNY I YDLL 
EEVPFD P INPNLHNLNCFVK I K2JHNMYVAGCTBVEVKS TEEAFE 
VFWRGQKKRRIANTHLNRESSRSHS VFNI KLVQAP LD ADG DNVL 
QEKKQITISQLSLVPIiAGSERTNRTRAEGNRIjREAGNINQSLMT 
LRTCMDVLRENQMYGTNKMVPYRDSKliTHLFKNYFDGEGKVRMI 
VCVNPKAEDYEENLQVMRFAEVTQEVEVARPVDKAICGIiTPGRR 
YRNQPRG P \ IGNE PLVTDWLQS F P PLPS CE I LDINDEQTLPRL 
IEALEKRHNLRQMMIDEFNKQSNAFKALLQEFDNAVLSKENHMQ 
GKLNEKEKMISGQKLEIERLEKKNKTLEYKIEILEKTTTIYEBD 
KRNLQQELETQNQKI^RQFSDKRRLEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEEL 
WAAQVKRLASMAQKEPRTI KI SLPGGQKIDAVAWNTTP YQLARQ 
J S3 TLADTAVAAQVNGE P YDLERPLETDSDLR FLTFDS PEGKAV 
FraSSTHVLGAAAEQFIiGAVLCRGPSTEYGFYHDFFLGKERTIR 
GSELP VLERI CQELTAAARPFRRLEASRDQLRQLFKDNP FKLHL 
' IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGLKLLSNSSS 
LWRSSG 


7006 


22 


898 



NAFGRHSTAVKMAAAAWLQVLPVI LLLLGAHPS PLS FFSAGPAT 

VAAADRSKWHIPIPSGKNYFSFGKILFRNTTIFLKFDGEPCDLS 

LNITWYLKSADCYNEIYNFKAEEVELYLEKLKEKRGLSGKYQTS 

SKLFQNCSELFKTQTFSGDFMHRLPLLGEKQEAKENGTNLTFIG. 

DKTAMHEPLQTWQDAPYrFIVHIGISSSKESSKENSLSNLFTMT 

VEVKGPYEYLTIjEDYPLMIFFMVMCIVYVLFGVLWIiAWSACYWR 

DLIiRIQFWIGAVIFLGMLEKAVFYAGFQ 


7007 


2 


1001 



AMTVSGPGrPEPRPATPGASSVEQLRKEGNELFKCGDYGGALAA 
YTQALGLDATPQDQAVLHRNRAACHLKLEDYDKAETEASKAIEK 
DGGDVKALYRRSQALEKLGRLDQAYLDLQRCVSLEPKNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKGTEKKQKAS. 
QNL WIiAREDAGAE KI FRSNGVQLLQRLLDMGETDLMLAALRTIi 
VGICSEHQSRTVATLS ILGTRRWS ILGVESQAVSLAACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKQWGLLDVTVMEGMGIiSQ 
PGQFFGDQTCSCRLFGI RFGDI I LL 


7008 



70 



1478 



CRSALGHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR 

SPPPLAGPGQKMVQKKPAELQGFHRSFKGQNPFEIiAFSLDQPDH 

GDSDFGLQCSARPDMPASQP I D I PDAKKRGKKKKRGRATDS FSG 

RFEDVYQLQEDVLGEGAHARVQTC INL I TSQEYAVKI IEKQPGH . 

IRSRVFREVEMLYQCQGHRNVLELIEFFEEEDRFYLVFEKMRGG 

S ILSHIHKRRHFNELEAS WVQDVASALDFliHNKG I AHRDLKPE 

NILCEHPNQVSPVKICDFDLGSGIKIjNGDCSPISTPELLTPCGS 

AEYMAPEWEAFSEEAS I YDKRCDLWSLGVILYILLSGYPPFVG 

RCGSDCGWDRGEACPACQNMLFESIQEGKYEFPDKDWAHISCAA 

KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 

WDSHFLLPPHPCRIHVRPGGLVRTVTVNE 


7009


1 



626 


ARQLRNSWVDDFVAAPLI PLSQQI PTGNSLYES YYKQVDPAYTG 
RVGASEAALFLKKSGLSDIILGKIWDLADPEGKGFLDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMPPPKFHDTSSPLMVTPPSAEAH 
WAVRVEEKAKFIX3I FESLLPINGLLSGDKVKPVLMNSKLPLDVL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAW PETLLS PIiCPLLGGGTAMSGGEQKPEXx XVC? vU vb i 
GS VRAALVDQSGVLLAFADQP IKNWEPQFNHHEQSSEDI WAACC 
WTKKWQG IDLNQ I RGLGFD ATCS L WLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


RIQTLPNQNQSQTQPLLKTPPAVLQPIAPQTTFGVQTQPQPQSL 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 

RKRSRERS PRRERERS PRRVRRWPRYTVQFS KFSLDCP S CDMM 
ELRRRYQNLYIPSDFFDAQFTWTOAFPLSRPFQLGNYCNFYVMH 
REVESLEKNMAILDPPDADHLYSAKVMLM^^ 

AEDPQELRDGFQHPARIjVKFIiVGMKGKDEAMAIGGHWSPSLDGP 

DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
ATFRKGNYVADLGAMVVTGLGGNPMAVVSKQVNMELAKIKQKCP
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKVVLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM 


7013 


1 



2661 



RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
KAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
\DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
ATFRKGNYVADLGAMVVTGLGGNPMAVVSKQVNMELAKIKQKCP
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKVVLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETOALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPB 

PIiATDSPTSDPTEWNGISSQPQVPFHPNLQKSQYYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDS KLTQQLI EFEKS LAGPGTEP 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNLKPAPPLWRPSRPAPLPPSAQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN" 
MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
IKVSKQLLAALEISDAVGPVFLGHRDELEGTYKIYCQNHDEAIA 
LLE I YE KDEKI QKHLQDS LADLKS LYNEWGCTNYINLGS PL I KP 

VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEINVNINE 
YKRRKDLVLKYRKGDEDSLMEKISKLNIHSIIKKSNRVSSHLKH 
LTGFAPQIKDEVFEETEKNPRMQERLIKSFIRDLSLYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLVI SPLNQLLSMFTGPHKLVQKRFDKLLDF YNCTERAEKLK 
DKKTLEELQSARNNYEALNAQLLDELPKFHQYAQGLFTNCVHGY 

QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLLARYPPEKLFQAERNFNAAQDLDVSLLEGDLVGVIKKK 
DPMGS QNRWLI DNGVT KG FVYSS FLKP YNPRRSHSDAS VGS HSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRS YRNFRHPE I VGYSVPGRNGQSQDLVKG 
CARTAQAPEDRS TEPDGS EAEGNQVYFAVYTFKARNPNELS VSA 
NQKLKILEFKD VTGNTE WWIAEVNGKKGYVPSNY IRKTEYT 


7015 



1842 



513 



RQAWHE \VAAPS WRGARLVQSVLRVWQVGPHYARERVI PFSSLL 
GFQRRCVSCVAGSAFSGPRIASASRSNGQGSALDHFLGFSQPDS 

KSTLSNQLLGRKVFPVSRKVHTTRCQAIjGVITEKETQVILLDTP- 
GI IS PGKQKRHHLELSLLEDPWKSMESADLVWLVDVSDKWTRN 
QLSPQLLRC^TKYSQIPSVIiTONKVDCLKQKSVLLjSriTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMLSALSQEDVKTLKQYLLTQAQPGPWEYHSAVLTSQTPE 
EICANIIREKLLEHLPQEVPYNVQQKTAVWEEGPGGELVIQQKL 

LVPKESWKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016 


167 



2513 



I LNAP KPP PPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYLS 
LVARLI IHFRDI HNKKSQAS VSDPMNALQS LTGGPAAGAAG IGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSS S SSSRRRYSSS SSSSNS KQ 
FQAQQS AMQQ \Q FQA\ WQQQQQL\QQQQQQQQHL I KLHHQNQQ 
Q I QQQQQQLQR IAQLQIiQQQQQQQQQQQQQQQQALQAQPP IQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSQAQALPGQMLYTQPPLKFVRAPMVVQQPPVQP 
OVOOOOTAVOTAOAAOMVAPGVOV^nR Q T . PMT • «I P «; PnonvnTD 

QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QS PVTARTPQNFS VPS PGPLNTPVNPS S VMSPAGSSQAEEQQY 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCB IALEKLKNDMAVPTPP P PPVPPTKQQYLCQP 
LLDAVLANIRSPVFNHSLYRTFVPAMTAIHGPPITAPWCTRKR 
RLEDDERQS I PS VLQGEVARLDPKFLVNLDPSHCSNNGTVHL I C 
KLDDKDLPSVPPLEIiSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCMTSRIiLQLPDKHSVTALLNTWAQSVHQACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI*ALFMATDFRRQVLSLNLNGCNSLMKKLQHL 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDR 
LHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTS 
DGE KTLI EKMFGGKLRTH1RCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEYMSCPDCSQS PS IQDGGLMQASVPGPSEEPWYWPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPG 
GETT PSVTDLLNYFLAPE ILTGDNQ YYCENCASLQNAEKTMQI T 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 

SVVVHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQ 

SHLLGRDSPSAVFEQDLENKEMSKEWFLF 

ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 

LMDAITKDNKLYLQEQELNARARALQAASASCSFRPNGFDDNDP 

PGSCGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEIIFQ 
PSLIGEEQAGIAETLQYILDRYPKDVQEMLVQNVFLTGGNTMYP 
GMKARMEKELLEMRPFRSSFQVQLASNPVLDAWYGARDWALNHL 
DDNEVWITRKEYEEKGGEYXiKEHCASNIYVPIRLPKQASRSSDA 
QASSKGSAAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVFPAPSPPWMLGCCSHEVTAGP PTLCKDMSALVAA 
RMRHIPIAPGSDWRDLPNIEWLSIXSimRKLRYTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTLI PWCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTNPEPMGKQGRVriHPEQHRWSVRECAR 
SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLEIKLCMLAK 
ARESASAKI KEEEAAKD 


7020 



1 


2154 



FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRVR 
NGFLMRKVAVFFSNTPTRASPQLREAVLKLSDAGITPLFLTRQE 
DRQLINALQINNTAVGHALVLPAGRDLTDFLENVLTCHVCLDIC 
NI DPS CG FGSWRPS FRDRRAAGSDVDIDMAF ILDSAETTTLFQF 
NEMKKYIAYLVRQLDMS PDPKASQHFARVAWQHAPSES VDNAS 
MPPVKVEFSLTDYGSKEKLVDFLSRGMTQLQGTRALGSAIEYTI 
ENVFESAPNPRDLKI WLMLTGEVPEQQLEEAQRVI LQAKCKGY 
FFWLGIGRiCVNIKEVYTFASEPNDVFFKLVDKSTELNEEPLMR 
FGRLLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNLVKFGHKQ 
VNVPNNVTS S PTSNP VTTTKP WTTKP VTTTTKP VTTTTKP VT I 
INQPSVKPAAAKPAPAKPVAAKPVATKTATVRPPVAVKPATAAK 
PVAAKPAAVRP PAAAAAKPVATKPEVPRPQAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAVVCYLRSQVRATYHGSFS 
TKKSQPPPPQPARSASSSTINLMVSTEPIALTETDICKLPKDEG 
TCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021 


2 


338 


VNAVS FFPNGYAFATGS DDATCRLFDLRADQEI1L1LYSHDNI I CG 
ITSVAFSKSGRLLLiAGYDDFNCRvwDTIiKGD 
CLGVTDDGMAVATGSWDS FLRIWN 


7022 


2 



856 


VYIGSFWSHPLLIPDMRKLFEAEEQDLFRDIQSLPRNAALRKLN 
DLI KRARLAKVHAYI I S S LKKEMP S WGKDNKKKELVNNLAE I Y 
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHDIAQLMVLVRQEESQRPIQMVKGGAFEGTLHGPFGHGYG 
hCiACjEGXUDAr.WVvAKUivrMzDaXr 1 1 IjorVJLKjK.1 IXiAlNAKraiM 
VRSKLPNSVLGKIWKLADIDKDGMLDDDEFALANHLIKVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 


2 


748 


AMVFGG WP YVPQ YRD I RRTQNADGFSTYVCLVLLVANI LRI LF 
WFGRRFES PLLWQS AIM I LTMLLMLKLCTEVRVANELNARRRS F 
TAADS KDEEVKVAPRRS FLDFDPHHFWQWSS FSDYVQCVLAFTG 
VAfi Y T TYTjS T D <? AJj FVETLG Fr iA VTiTEAMLGVPOL YRNHRHOS T 

EGMS I KMVLMWTSGDAFKTAYFLLKGAPIjQFSVCGLLQVLVDIiA 
I LGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 
S SRRGS PGTVLGLPFWLLTP VLVS RS I RSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEAIiRQAAVGQGD 
FHLLDHRGRARCKADFRGQVmjMYFGFTHCPDICPDELEKLVQV 
VRQLEAEPGLPPVQPVFI TVDPERDDVEAMARYVQDFHPRLLGL 








TGSTKQVAQASHS YRVYYNAGPKDEDQDYI VDHS IAI YLLNPDG 
LFTDYYGRSRSAEQI SDSVRRHMAAFRSVLS 


7025 


232 


832 


ERNSPIGNNENL*K\HSIiDCLCFRGDWEGNTQPQTLQDNQEECF 
KQVI RTCE KRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAFI SG 
SDHTQHQLIHTSEKFCGDKECGNTFLPDSEVIQYQTVHTVKKTY 
ECKECGKS FSLRSSLTGHKRI HTGEKP FKCKDCGKAFRFHSQLS 
VHKRIHTGEKS YECKECGKAFS CG 


7026 


328 


1146 


NPNPS IGDI KDIKKAAKSMLDPAHKSHFHPVTPSLVFLCFIFDG 
LHQALLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
SSVLRHCCDLLIGVAAGSSDKICTSSLQVQRRFKAMMASIGRLS 
HGESADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKS S *LPRKHR*QP INAVRMFLDQCMDGS IAIiRAIVSE I PVFE 

EKKJWG*KGIGEIF*WGCTLPPHYWGAVTTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 



43 


954


GRRLQQQQRPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKLFEH 
YYQELKIVPEGEWGQFMDALREPLPATLRITGYKSHAKEILHCL 
KNKYFKELEDLEMDGQKVEVPQPLSWYPEELAWHTNLSRKIIiRK 
S PHLEKFHQFLVSETESGNISRQEAVSMI P PLLLNVRPHHKILD 
MCAAPGSKTTQL I EMLHADMNVPFPEGFVIANDVDNKRC YLLVH 
QAKRLSSPCIMWNHDASSIPRLQIDVDGRKEILFYDRILCDVP 
CSGDGTMRKNIDVWKKWTTLNSLQLHGLQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSRNFIRNS 

KKMQSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 

EILLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 



40 


VLESNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQLRIjWG 
/ PCPHAGRETGPRASAP IPGS *GHGWHW*RKDGRGERSEGPSAL 
SPHS PSLLNMOQAPTHVG PGMGSOR PRS SVVPEOVGVGSOTjS R F 

RWRA* RSLPGAAASERTEMTKERS P/RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGiSGCLLYPIjQS iMPE*QLR*GAIiASPPTQG 
R*GKGGPRSPLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS\ 
AAPGQKRGHREA* QGPEPV/ WGRVTTHLQGPAG* TKPLGS \RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFSPQLS IPMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 
PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 



521 

9 


FVCFSAPGSGQGGKRRVNMELSAVGERVFAAEALLKRRIRKGRM 
E YLVKWKG WSQKYS TWEPE ENI LDARLLAAFEEREREMEIiYGP K 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIRIPYPGRSPQDL 
ASTSRAREGLRN \RVCPRQRAAPAPAAP \ PRRGP SGPGPRPG* G 
PGLHFPGPGGPSKHGFVPASEQHQHQQHLPRRGPSGPGPRPG 


7031 


960 



59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS/RHCDELHEGPSRTAALPCXSKPQPKHGVEECG/PCPCLA 
PRRLTE PPALTVSP VGRAAPSGAL* PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 
LGAQARAAPPRLWCPRALVSG*EASPEAVSVAAGPPVPGPTPST 
SGSTASHSRRGC*SPR*TPAPPRRDHGRSAAFEVLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRLHSLFWKSWQKMNSFXLTPiOiDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 






WO 01/53312 



PCT/US00/34263











LMMPSSCPWRTGALGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT*SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD 
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFDIDVNASVFETNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL
TCDPVFEDVARVAIWRLWESRSDIGLVGNHIDVLTGKWVAQDAG
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
^WVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF
FIAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K 


7035 



92 



1942 



EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFDIDVNASVFETNI
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
^WVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF
FIAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K


7036 


442 



751 


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


751


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 



GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7039 


155 



891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN 




PGSQRRRLI PALSLDTSS PVRKPPNSTGVRW VDGPLRS S PRGLG 
EPFEIKVYEIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERLESRVNFCKAHLMMITCFD IT 


7041 


1 


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYDWGRLNLQS VTEQSSLDDFLATAELAGTEFVAEKLNI KFV 
PAEARTGLLSFEESQRIKKLHEENKQFLCIPRRPNWNQNTTPEE 
LKQAEKDNFLEWRRQL\VRLEEEQKLILTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


P IHMAAAALRAD I \ I SPLFPH I QGYLLLS ASHG \ ATSLHTKGAL 
PLETVTM YTVI P KS KYVLVKPDTQYP YS ENLDEFKRLAENS ASN 
DDLLMAEVAI SDYGDKLTLELREKY 


7043 



2 



2170 



ARGMAARDSDSEEDLVSYGTGLEPI^EGERPKKPIPLQDQTVRD: 

EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD ; 

KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 

QLAAATAP I PGATLLDDLITPAKLS VGFELLRKMGWKEGQGVGP 

RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 

APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 

SERAGDLGEIGLNKGRKIiGISGQAFGVGALEEEDDDIYATETLS 

KYDTVLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 

SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQy 

LSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 

LSQKDKERIKEMKQATDLKAAQLKARSLAQNAQSSRAQPS PAAA 

AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 

GQKDALERCLDPSMTEWERGRBRDEFARAALLYASSHSTLSSRF 

THAKEEDBSDQVEVPRDQENDVGDKQSAVKMKMFGKLTRDTFEW 

HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 

KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 

QQSSPLVNKEEEHAPELSAN 


7044 



276 


734 


EVYLTDEFAKGRKVADLYELVQYAGNIIPRLYLLITVGVVYVKS
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG
EPTDEETTGDISDSMDFVLLWFAEMNKLWVRMQHQGHSRDREKR
ERERQELRILVGTNLVRLSQV


7045


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7047 


103 


486 


QMKIEKCGWSEGLTSIKGNCHNFYTAISKDVTYKELKNLLNSKN
IMLIDVREIWEILEYQKIPESINVPLDEVGEALQMNPRDFKEKY
NEVKPSKSDS/IVFSYIAGVRSKKALDTAISLGFHSYYER


7048 


92 


627 


FFCLTLLSSWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY
WKDIAMTYKQRAENTQEELREFQEGSREYEAELETQLQQIETRN

ROLLS ENNRLRMELET I KEKFEVQHSEGYkQISALEDDIiAQTKA 
IKDQLQKYIREIiEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG


7050 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL



7051 



119 



7052 



467 



7053 






816 



467 



7054 



7055 



7056 



7057 



1368 



7058 



7059 



715 
1036 



527 



527 



431 



469 



1178 



7060 



90 




VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG



KKMNLAE I CDNAKKGRE YALLGNYDSSMVYYQGVMQQ IQRHCQS" 
VRDPAIKGKWQQVRQELLEE YEQVKS I VGTLES FKIDKP PDFPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRS PGTCRPSr\ PI SKS EKPS TSRDKDYRARGRDDKGRKNMQDG 

ASDGEMPKFDGAGYDKDLVEALERDI VSRNPS IHWDDIADLEEA 
KKLLREAGVL PMWM 

SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR



SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR

GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
RRCRWDAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTGCYYRCHSKCIiNLISKPCVSSKVSHQAEYELNICPETGLDSQ 
DYRCAECRAP I / CS/DGWPSEARQCDYTGQYYCSHCHWNDLAV 
IPARVVHNWDFEPRKVSRCSMRYLAIiMVSRPVLRLRE IN 



DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL



M 



DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL



M 



GIYLHVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
S PQERI SEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLI FHQRTHTGETYFQCTI CKKAFLR 
SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHQRSHLGKKPFQ * P VTKLS FPIS IS QPSHKNTQLHQEELCLR 
GYPC 



FSGFGAVPDALGCRMSDLRITEAFLYMDYLCFRALCCKGPPPAR 

PEYDLVCIGLTGSGKTSLLSKLCSESPDNWSTTGFSIKAVPFQ 

NAUjNVKELGGADNIRKYWSRYYQGSQGVIFVLDSASSEDDLEA 
ARN*SCTQLLQHPQLCTLPFLILA 



1670 



WPAFPRQPAAAAMDALLGTGPRRARGCLGAAGPTSSGRAARTPA" 
APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 
CYLSFPDSHSGCLGDTQFSFRMRQCGGQRSPWHADDRHYNSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALLSLIAPEYFDKLAPCLEAVCSEIDQWPAPAPGQTLNLPVM 
GVWQVRI PSRVDKSESSPPKQFDQENLLPAPWLASVHELDLF 
RCFRPVLTHMQTLWELMLLGE PLLVLAPS PDVSSEMVLALTS CL 
QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNVVIjGVTNPFFIK 
TLQHWPHI LRVGEP KMSGDLPKQVKLKKPFKV* RPWDTKP 




S VNLP PSL WP WEEAMD S TKSE PL KGS PEAEDGN I EYKKLVNPS Q 
YRFEHLVTQMKWRLQEGRGEAVYQIGVEDNGLLVGLAEEEMRAS 




LKTLHRMAEKVGADITVLREREVDYDSDMPRKITEVLVRKVPDN 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHE I QSGRTSSI S FE I IjGFNS KGEVHGINGTQWGQTLRMGW * * * 
RT*DGGRVWRLFEIV*MNALRGL*TSSAPLRKSMGNQLN*IKNG 
VKI KRQGHPGNGLGPGNSEGVGRAGRRH* GP WALGQWNYSDSR 
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
SANTGIAGTTREHLGIxAIjALKVPFFIWS 

QLERVLKQPGCHKVPMIiVTSEDDAVTAAQQFAQS PNVTP I FTLS 
SVSGESLDLLKVFLNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR*IDLLATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 



710 


ARMPSPLGPPCLPVMDPETTLEEPETARLRFRGFCYQEVAGPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGS PEEAAALVEGLQHDP * ARMPS PLGPPCLPVMDPETTL 


EE PETARLRFRGFC YQE VAGP REALARLRE L CCQWLQ PEAHS KE 
QMLEMLVLEQFLGTLPPEIQAWVRGQRPGSPEEAAALVEGIiQHD 
PGQLLG 


7062 


71 


744 


AKAGTNIiERLHWLSYFFCIPKHKLKSSQKDKVRQFMACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGRYKDPQDENKIGVDGIQQFCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQELKDTAKFKD 
FYQFTFTFAKNPGQKGLDL*MAGAYWKLVLSGRFKFLYLWNTFIi 
MEHH 


7063 



2 


562 


LRTVPDLPGRRFRANRTGQRR * PELPPDMNSLEQAEDLKAFERR 
LTE Y I HCLQ PATGR WRMLL I WS VCTATGAWNWLI DPETQKVS F 
FTSLWNHPFFTI SC1TLIGLFFAGIHKRWAPS I IAARCRTVLA 
EYNMSCDDTGKIiILKPRPHVQ*QSSLIVMGLKIAFLRISDTAKS 
HKGFLLRLDM 


7064 


300


884


RDTGSDPSSTRRLCSTCCTGH*PABPIASPHPSRGTCPPASSAS 
SI^TGC^TCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC . 
S SRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPS PKSAAPJ 
FJjIj TPLiGAGRAGGSRANS 


7065 



1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG. 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNN1ADPEEL 
FTKLERIGKGSFGEVFKGIDNRTQQWAIKIIDLEEAEDEIEDI 
QQE I TVLSQCDS S YVTKYYGS YLKGSKLWI IME YLGGGSALDLL 
RAGPFDEFQ 


7066


356 


616 


PG PQ RG PWRAREGGHPLDP ADHPRAPAS LRSNVRAATMMQ I CDT 
I KHb Jjr JNAWNK1? iVsAVWWMUy 1 VM V PoIjJjKJJVFIjADPGIaDND 
VGVEVGGSGGCLEERTPP 


7067 



152 



973 


KEN I TMATE I GS PPRFFHMPRFQKQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGYYNDLVPP IGMLNNPt4NAVTTKFVRTSTNKVKCPVFVVRW 
I i'lsUKKJjV J CjAb SGlir rjjWWGIjI F W r E 1 1 LQAHDS P VRAMTWSH 
NDMWMLTADHGG YVKYWQSNMNNVKMFQAHKEAI REARFIHNI P 
FSWPIVMVKLFSKCILGAEMHGLCQFLGNFLHPINTIFFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVIJ.LFLALCSAKPFFSPSHIALKNMMLKDMEDTDDDDD 
DDDDDDDDDDEDNS LFPTRE PRSHFFPFDLFPMCP FGCQCYS RV 
VHCSDLGLTSVPTNIPFDTRMLDLQNNKIKEIKENDFKGLTSLY 
GLILNNNKLTKIHPKAFLTTKKLRRLYLSHNQLSEIPLNLPKSL 
AELR I HENKVKKI QKDTFKKK 


7069 


1147


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAKQ 
TLKDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSLMPLTPFW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 








EKTKKGRKDKAKKSKTKMPSLVKKWQSIQRELDEEDNSSSSEED 
RVSTAQKR I EEWKQQQLVSGMAERNANFEA 


7070 


1 


547 


DGTMSDS EAVQ RATAL I EQRLAQEEENEKLRGDARQKLPMDLLV 
IiEDEKHHGAQSAAIjGKVKGQER VRKX*SLDLRRE I IDVGG IQNLI 
ELRKKRKQKKRDALAASHE P PPE PEE I TGPVDEETFLKAAVEGK 

ATVDPQ 


7071 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSUX3YYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7072 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSUX3YYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7073 


50 


504 


LAHGSFGVSDFPAPAAAPAHTLTSFSGSIiSPQFRKPLGRAPAMP 

TArPVPinA/TT/SVPmTriWTGT aTJOT7\rP(t!T?l?OT7/~ , V'nP , r'\7Ti r KT r T t V > OV > T 
JjVJtx JLXCJxV ViJuo I KLwuJMoiiHriyi? V JiUL>r oCioXJJJr 1 v clvi 1 X oKJL 

VTLGKDEFHLHLVDTAGQDEYSILPYSFI I GVHG YVLVYS VTS L 

HSFQVIESLYQKLHEGHGK 


7074 


263 



1003 


VCPVLCSTRQEPGHS S LVTYFGKPTRRKEFLLGHCIAAGKMNI S 

VL/JjA XXM XADIiVJjy VuAV lX^XM><OlXi\X\I v lJNX'l^I\JJ£v^ 

CALLNSGGGVI KAE I ENEDYS YTKDG IGLDLENSFSNILLFVPE 
YT^FMQNGNYFL IFVTCSWSLNTSGLRITTIiSSNLYIOiDITSAlW 
/ MNATAAIiEFLKDMKKTRGRLYXRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKIjTFTES THVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVERIMKKTEESESQ 
vppPTTTPKT/nnifPwr'QTvnpTPPr.QpaGifK'r'r.TH'T pnr mstinun 

vdrCi XVXCXx v v^XV.rCCl^o X X ^ lr X Jr IrXJorv-iO IVXVV^XjXrlXjliX/Xj^Krl^Ky 

" AITLNESTGPLLRTS IHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076 


279


1049 


LQSESSNAAEGNEQRHEDEQRS KRGGWS KGRKRKKPLRDSNAPK 
S PLTG YVRFMNERREQ LRAKR P E VP FP E I TRMLGNE W S KLP PEE 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH i 
RQDAARQATHDHEKETEVKERS VFDI PI FTEEFLNHS KAREAEL 
RQLRKSNMEFEERNAALQKHVESMRTAVEKLEVDVIQERSRNTV 
LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


ELKWLDMFSNWDKWLSRRFQKVKLRCRKGIPSSLRAKAWQYLSN 
S KELLEQNPRKFEELERAPGDPKWLDVI EKDLHRQFP FHEMFAA 
RGGHGOODT • YR T LKAYT I VP P DEG YCOAO AP VAAVT J^MHMPAPO 

AFWCLVQICDKYXPGYYSAGLEAIQLDGEIFFALLRRASPLAHR 
HLRRQRIDPVLYMTEWFMCIFARTLPWASVLRVWDMFFCEGVKI 
I FRVALVLLRHT1XS VEKLRSCQGM x^TMEQLRNLPQQCMQEDF 
LVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFILLAKGTSGSALTALISQVLEAPG 
VWFGELLELAOTQEl^GAI^YLQLMLFAYGTYPDYIANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQARKKRRGIIEKRRRDRINSSLSEIxRRLVPTTiFEKQGSSKIjEK 








AEVLQMTVDHIiKMLHATGGTGTHAIiLFQASFrQQIF 


7080 


200 


595 


VQLPLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPQDGIPY 
LTHPLCHQDWSVGRIjQIRALATPGHTQGHLVYLIiDGEPYKGPS 
CLFSGDLLFLSGCGEFPRKREELGEEGETETOAATVPWRALKP " 


7081 


213 


506 


AVTEEEMILNSI^LCYHNKLILAPMVRVGTLPMRLLALDYGADI 
VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 
RWFQMGTS 


7082 



3 



1137 



IX P ^PTVPTMT.M a WnjfJDVT .T JT ,DOm T?T.TJr* T /2ni?PI^T?r , R t> C V 
■rUrO JviV 1 1 T J JJrJrarV \~i\\JF V iiDLiinuwiJu 1 War JjxlLriAay aftf ouAKo Ju 

CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 

SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 

QAQSATEVEERHVSPSCSTSRBRPFQAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFQKT VGKFPGG T T.P PnTfOVMT ,PP P 

ALEDYWLMKRGTAITFPKDINM ILSMMD INPGDTVLEAGSGSG 
GMSLFl^KAVGSQGRVISFEVRKDHHDLAKKOTKHWRDSWKLSH 
VEEWPDNVDFIHKDISGATEDI KSLTF1?AVALDMLNPHVTLPVF 
Y PHLKHG G VC P VYWN I TOVI E LLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP 
SPKAPRARPCRVSTADRSVRKGIMAYSLEDLLLKVRDTLMLADK 
PFFLVLEEIXSTTVETEEYFQAIjAGDTVFMVIiQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 

4 


3 


522


NS VSVSSQSRFLASVPGTGVQRS AAADMAASTAAGKQR I PKVAK. 
VKNKAPAEVQITAEQLLREAKERELELLPPPPQQKITDEEELND 
YKLRKRKTFEDNIRKNRTVISNWIKYAQWEESLKEIQRARSIYE 
RALDVDYRNI TLWLKYAEMEMKNRQVNHARNI WDRAITTL 


7085 



243 



1499 



RQLARLRRRG WRSPFGGAPMAH I T I NQ YLQQ VYEAI DSRDGAS C 

YAVGNHDF I EAYKCQTVIVQSFLRAFQAHKEENWALPVMYAVAL 
DLRVFANNAT^QLVKKGKSKVGDM 

AG T ED S FCKWGMT iFTA/WOT .FKT YPTf T XTK" T .HT .PK" PT .TP & T n<2 GAIT .1? 

DDYSTAQRVTYKYWGRKAMFDSDFKQAEEYLSFAFEHCHRSSQ 
KNKP^ILIYLLPVKMLLGHMPTVELLKKYHLMQFAEVTRAVSEG... 
NLLLLHEALAKHEAFF I RCG I FX I LEKLKI I T YRNLFKKVYLLL 

KTHQLSLDAFLVALKFMQVEDVDIDEVQCILANLI YMGHVKGYI . 
SHQHOKLWS KONPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNSKLRPEVMQDLLESTDFTEHEIQEWYKGFIjRDCP 
SGHIiSMEE FKKI YGNFFP YGDASKFAEHVFRTFDANGDGT I DFR 
EF 


7087 


166 


723 


LSGSSAGKVAAPCVPP SNHELVPI TTENAPKNWDKGEGASRGG 
NTRKSIiEDNGS TRVTPS VOPHLOP I RNMS VS RTMEDS CE LDLVY 
VTER I IAVS FPS TANEENFRSNLRE VAQMLKSKHGGNYLIjFNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKI CS I CKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 



GTSAASPSSLLEMAGEITETGELYSSYVGLVYMFNLIVGTGALT 
MPKAFATAGWLVSLVLLVFLGFMSFMTTTFVI EAMAAANAQLHW 
KRMENLKEEEDDDSSTASDSD^IRDNYERAEKRPILSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDLAIYA 
AAVPFSLMQVTCSATGNDS CGVEADTKYNDTDRCWGPIjRRVD 


7089 



33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRPTOELVTLEEAIXSGSDILLVVPKATVI^NQIjDESQQERNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP








LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR 
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7090 


33 



1775 



SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEAIXKSSDILLWPKATOuQNQIiDESQQERNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVKAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7091 


186 


1076 



EGMLTREHRCGRSEEQELEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSL vpvyi ys pe YVSMCDSLAKI PKRA5MVHSLIEAY 
ALHKQMRIVKPKVASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 

WSGGWHHAKKDEASGFCYLNDAVLG ILRLRRKFERILYVDLDLH 
HGDGVEDAFSFTSKVMTVSLHKFSPGFFPGTGDVSDVGLGKGRY 
YS VNVP IQDGIQDEKYYQ I CERYE PPAPNPGL 




CIO 


C% f\ f\ 

809 


KQG I NEDQE E S QKFRjjGEGCEP I S KKQMKKb I KQ KQWE EQRELR 
KQKRKEKRKRKKLERQCQMEPNSDGHDRKRVRRDVVHSTLRLI I 
DCSF0XLM 


7093 


454 


655 

"v — 


NFGVSGVELAQQASMVRMS FVIAACQLVLGLLMTSLTESS IQNS 
ECPQLC VCEIKPWrUPQb 1 YKEA 


7094 


1 2 


508 


FVRSMHWGVGFAS SRPCWDLSWNQS I SFFGWWAGSEEP FS FYG 
DI IAFPLQDYGGIMAGLGSDPWWKKTLYLTGGALLAAAAYLLHE 
LLVI RKQQE IDS KDAI I LHQFARPNNGVPS LS P FCLKMET YLRM 
ADLP YQNYFGGKLSAQGKMPWIEYNHEKVSGTEF 1 1 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
SLECVSHEVDSHYCPSCLENMPSAEAKLK^CNRCANCFDCPGCMH 
TLSTRATSISTQLPDDPAKTTMKKAYYLACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
SAMSPAPDAAPAPASISLFDLSADAPVFQGLSLVSHAPGEAIiAR 
APRTSCSGSGERESPERKLLQGPMD1SEKLFCSTCDQTFQNHQE 
QREHYKLDWHRFNLKQRLKDKPLLSALDFEKQSSTGDLSSISGS 
EDSDSASEEDLQTLDRERATFEKLSRPPGFYPHRVLFQNAQGQF 
L YAYRCVLGPHQDPPEEAELLIjQNLQS KGPRDCVVLMAAAGHFA 
GAI FQGRE WTHKTFHRYTVRAKRGTAQGLRDARGGPSHSAGAN* 

T .R T? YNE ATT .YKHVR DT »T . ARP WAKAT i R R ART T T tT iRA PR SGR S L F 
FGGKGAPLQRGDPRLWD I PLATRRPTFQELQRVLHKLTTLHVYE 
EDPREATOLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGLETNNSNSELPLRVGLKVAQGSPLMGGQVSA 


SNS FS RLHCRNANEDWMSALCPRLWDVPLHHLS I PGSHDTMTYC 
LNKKSPISHEESRLLQLLNKAIiPCITRPVVLKWSVTQALDVTEQ 
LDAGVRYLDLRIAHMLEGSEKNLHFVHMVYTTALVEDTLTEISE 
WLERHPREVVIIiACRNFEGLSEDLHEYLVACIKNIFGDMLCPRG 

c v tr i orv.vjr^^; viva XCJJjaooljjCrCxlrllSJjWJrVj vjf X WWOnrtVA 

TEALIRYLETMKSCGR 


7098 


82 



956 


SSFLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVMRliQT 
E ARSGFWAPNRFP VN I CRMTAVDGDRGGSSRETCRCHFHPS LEA 
IiVLLLQDWQPGGVGI CTS FLGISV7ALLDYHRALRTCLPSKPLLG 

LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
T JiWVW T iOG TD PMPT"J P <! Q P WT .VR VTVA?T T .V P Q W PKT\/& Pf2P TP fTJ 

AI IHFAFLLSDS ILLVATWVTHSSWLPSGI PLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQVD 


7099


992 



210 



LFRLAPGFLRSliARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
A R ^ LPP ^ P A P X>d PMD A T »T •f3P WT}PP RDR\7PJ\ OP PP 1TC P P PflD/ZA n 

GAVLE VHVPQ I G AGVS LPG I LAAKCGAEVI LSDS S ELPHCIiEVC 

RQSCQMl^PHl^WGLTWGHISWDLIJ^PPQnilLASDVFFEP 

EDFED I LAT I YFLMHKNP KVQLWSTYQVRS ADWSLEALLYKWDM 

KCTVH T P LiES PD ADICR n T A R c; TTjPfiP HTVPIvlT i V T <J P ATTR*; T . 
xv v« vnxr i injt x/ax/xveix/ xx~ve«o ijjcvjiui x v &l jxj v j. or/UM/ou 


7100 


205 


671 


ANGGFT^EAAPGSEVSLPLOTPTASHSKTTALGIGSAPPPHLSVL 
FLFS FP PQLGDPLEAFP VFKKYDRKGLNVS I ECKRVSGLEPATV 
DWAFDLTKTNMQTMYEQSEWGWKDREKREEMTDDRAWYLIAWEN 

i?u v r v i^iif i^r i/ v uAvUjj v ui n • 


7101 


2


503 


WRGGPRPJlKRlxAGGAVGWVLLVRGVHSVPJ\GGGRPPPJ^ADMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 

VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7102 



2 


503 


WRGGPPJvAKRlxAGGAVGWVLLVRGVHSVRAGGGRPPR7W)MKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSl^QsbEQLHQEISQAWICIVYAyNNKHSIDK 
VTSRW I PLI1STERTDKDSRLPLILGGNKSDLVE YSR 


7103 


119 


438 



GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES 

T.CinTf n QPT iW Q PT) A WP7WT ."XVTPPP YARn TTT ,MTM/P\/I?tf ATnD 
jjoiJx\vjoC(i_ixvcy& r L»riv vru vjjav x Jrx>£*i>\o^x. xxjinx/ v f v c luAiyr 

DELSS CGWNKKE KYS SAP 


7104 


1670 


795 


RLWEHRSVSAGASGWGLSS PGCLLLHPSLPEEERVDIIjINNAGV 
MRCPHWTTEDG FEMQFGVNHLGEAWAGAAPWVQAI IiPRRP PKVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGH I DFDDLNWOTRKYNTKAAYCOS \ KIiAI VLFTKEtiS RRIjO 
GSGVTVNAIiHPGVAJ?TELGRHrciHGSTFLQHHN\WAHLLAAWS 
KSPRSWPAPAQHNTLAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREQPLPR 


7105 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGLQDPQCLALFRVAVDKHQA 
LLKAAMSGQGVDRHLFAL Y I VSRFLHLQS PFLTQVHSEQWQLS T 
SQIPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDG 
M ITFH I S S KKS STKTDSHRLGQHIEDALLDVAS LFQAGQHFKRR 
FRGSGKENSRHRCGFLSROTGASKASMTSTDF 


7106 


14 


1064 


GLQAGH PH PRSASR I PEAD TH \ YSKLQRAFDS I VNKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNXAYIQITFVEPYFDEYEMKDRV 
TYFEKNFKTLPJ^FMYTTPFTLEGRPRGELHEQYPJvNTVLTTMHAF 
PYIKTRISVIQI<EEFVLTPIEVAIEi;MKKKTlX3I^VAINQEPPD 
AKMLQM VLQGS VGATVNQGPLEVAQVFLAE I PADPKLYRHHNKL 
RLCFKEFI MRCGEAVEKNKRLI TAIX3RE YQQELKKITOWKLKENL 
RPMIERKI PELYKP I FRVESQKRDSFHRSS FRKCETQLSQGS 


7107 


1145 


591 


*I*WLQTGKKK 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


7108 


1 



942 


VKVALLLTOLEQPRTESEWENSOTLKMFLFQI^LNSSTFYIAF 
FLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGIIMVLK 
QTWl^l^I^YPLIQNWWTRRKVRQEHGPERKISFPQWEKDYNL 
QPMNAYGLFDEYLEMILQFGFTOIFVAAFPLAPLLALLNNIIEI 
RLDAYKFVTQWRRPLASRAKDIGIWYGIXJ3G1GILSVITNAFVI 
AI TSDFI PRLVYAYKYGPCAGQGEAGQKCMVGYVNASLS VFRIS 

Ur £>ivKaak'&jV\JZ>tlji? ovjI Jr-Ulvx V_a, iKiJxKjJr'i J noljV Jtr ILx I i lis*'? 

WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEAI^IQLQPKE 
TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKLATDTSTFEATSEGTIiELQQRNPKAERLRWSPAQEESFRQM 

V V ±rHSJ2tLf x^J\.JSJJcitL\^£3E,\^\slS.l r x llMonb V vn^K vrlbuo i ivv 

SDCGKTFKQSSNIX^HQRIHTGEKPFECNECX3KAFRWGAHLVQH ! 

QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 

KAYGWCSELIRHRRVHARKEPSH 


7110

/XXV 




D3 / 


DT.nMT?CnWT.VFV^TfPT?T3'HT'\7TrDT,Vr>PVPr.\7Vr\MT.TT?aQT r P'D , \7T P 
KJjJUIi r abr Ju V & v I xYCi&KiTX V tvtr jU X JJit I KLiv j\yi v JJLi iXnol 1 Jr V IjVj 

SPSTKRRGQMLQP I IEGETAHFFEE X KEEEEDGVNLSSELGDML 
KTAVQVQSSLKNSESDVEENQEKIaALDLRLSSSRAASMPELLEQ 

YKKIKAKLRLLEVDISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEQLQRDRRKV 
MEENN I VHQARFFRRQTD SSG KEWOTTNNTYWRLRAE PGYGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRLIGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 

T?V"MTVYriTr\T GT7tIT?C!\n/^n?07V YVTTOMTT VTT7 r rOT7nOr»trVO TTVTT 7X1 

r AJNl>yiJ 1 yijis iSHr t> Vxx V tS\^AJ\X VoM 1 X V 1 oLUobKlc XlM X AXv 

KYGGEKI DVTVS VYKHGE KI PDMAPPQQAKPKLI PASASAAGQ 


7113 


1 



824 


KCLRQAWHEAPSSIx^FTRWCSREERAEGGGNLHRS ITRDPKPPG 
LRPS QRPMDDKKKKRS P KP CLAQPAQAPGTLRRVPVPTSHSGSL 
ALGLPHLPS P KQRAKFKRVGKEKGRP VLAGGGSGSAGTPLQHS F 
LTEWDVYEMEGGLLl^LNDFHSGRLQAFGKECSFEQLEHVREM - 
QEIGxARlxHFSLDVCGEEEDDEEEE^Vra^ 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRR 


7114 



3


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQECKICRKI 
IYIiNTDFVSVKQRLPK^SWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWK7CLHYNLHKAQPAERFFDPNQRGKALHQKQALRKSQRS 

SQKSTLIAHQRTHTGEKPYECSECX3KTFIQKSTLIKHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RI HTS E KPQCS EHGKASDEKPS PTKHWRTHTKENI YECSKCGKS 
FRGKSHLSVHQRIHTGEKPYECSICX3KTFSGKSHLSVHHRTHTG 
EECPYECRRCGKAFGEKSTLIVHQRMHTGEKPYKOSrECGKAFSEK 
SPLIKHQRIHTGERPYECTDCKKAFSRKSTIjIKHQRIHTGEKPY 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
lui^Kon x oX/ivivxi 


7115 


1 


947 


NAAHGYNWGLWCMY 1 1 PPQDWLDRGDESAP IRTPAMIGCSFWD 
REYFGD IGLLD PGMEVYGGENVKLGMRWQCGGSMEVIjPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEWMDDFKSHVYMAWNI PM 
SNPGVDFGDVSERLALRQRIiKCRS FKWYLENVYPEMRVYNNTLT 
YGEVRNS KASAYCLDQGAEDGDRAI LYPCHGMSSQLVRYSADGIi \ 
LQI^PLGSTAFLPDSKCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSGPIVSRATGRCLEVEMSKDANFGI^LVVQRCSGQKWMIRN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT \ 
PGSVIlWLSIimRETOHLRDRNSGSSSSIiNrTLPSTSAWSSIR 
ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSLAHELWKVP 
LPPKNITAPSRPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGSSWGBSSSGRITNWLVLjKNLTPQIDGSTLRTLCMQHGPLIT 
FHLNLPHGNALVRYSSKEE WKAQKSLHI SDLFLLTL 


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGALPQGA 
FVSQAARAI PLLQPSQAAQAEGLSQPARACGALCSLPWPLRNWG 
SPILRLPGGLRTPTOTRKTRTRSAMACWARAQWDTLGPLKLSHR 
G KVCLRH PR PTGVRGG PGAAGRQGGMGTRRRGT FTSGARD PGGL 
RVKHRCQPTGHLP 


7118 



49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHS I VRLVAFC PFAS S Q VALENANAVSEG WHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEBLGYNCQTGGVIAEI 
LRGVRLHFHNLVKGLTDLSACKAQLGI^HSYSRAICVKFNVNRVD 
NMI IQS ISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYIANKCSIASRIDCF. 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRXiKKEKKRLAALALASSEMSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMBDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 



49 



1863 



PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHS I VRLVAFC PFASSQ VALENANAVSEG VVHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI 
LRGVTUjHFHNLVKGLTDLSACKAQI/SLGHSYSRAKVlCFNVNRVD 
NM I IQS I SLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCMAQFIGNI^IELNEDKLEKLEELTMDGAKAKAI LDASRS SMG 
MDISAIDLINIESFSSRVVSLSEYRQSLHTYIjRSkMSQVAPSLS 
AL I GEAVGARL I AHAG S LTNLAKY PAS TVQI LGAE KALFRALKT 
RGNTPKYGL I FHSTFIGRAAAKNKGR I SRYLANKCS IASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKlCRLKKEFbCRIiAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSrSFSKPKKKKSFSKEEL 
MSSDLEETAGSTS I PKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7120 


1991 


64 


QIX3TRRCLRGDKVTNAMQDFLVTNLEPRFIEPQTANLSVVFKDS 
NSTTPLIFVLSPGTOPA^LYKFAEEMKFSKKLSAISLGQGQGP 
RAEAMMRSSIERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 
RDFRLWLTSLPSNKFPVS ILQNGSKMTI E PPRGVRANLLKS YSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICISQLKMFLDEYDDIPYKVLKYTAGEINYGGRVTD 
D WDRRC I MN ILED F YNPDVLS PEHS YS ASG I YHQ I P P TYDLHG Y 
LSYIKSLPLNDMPEIFGLHDNANITFAQNETFALLGTIIQLQPK 
SSSAGSQGREE I VEDVTQNILLKVPEP INLQWVMAKYPVLYEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLVVMSSQLELMA 

IPAVFWISGFFFPQAFLTGTLQNFARKFVISIDTISFDFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCP I YKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRH W I KRGVAL I CALDY 


7121 


2 


546 


RPLR P WVLS LGS M VGLMT YGRRQ FQS LDTTMRRL I P P FRE ASAK 
LTTLVDADAEAFTAYLEAMRLPKNTPEE KDRRTAALQEGLRRAV 
SVPLTIAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF 
G AYFNVL INLRD I TDEAFKDQ IHHRVS S LLQE AKTQAALVLDCL 
ETRQE 


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK 
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGI.RRAV 
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKAIjEMGVF 
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL 
ETRQE 


7123 


1 



1092 



KPAVPEARSAGTSEAGRSGAEEVSCGSVSGDGAAMRLTPRALCS 
AAQAAWRENFPLCGRDVARWFPGHMAKGLKKMQSSLKLVDCIIE 
VHDARIPLSGRNPLFQETLGLKPHLLVLNKMDLADLTEQQKIMQ 
HLEGEGLKNVI FTNCVKDENVKQ 1 1 PMVTELIGRSHRYHRKENL 
EYCIMVIGVPNVGKSSIiINSLRRQHLRKGKATRVGGEPGITRAV 
MSKIQVSERPLMFIiDTPGVLAPRIESVETGLKLALCGTVLDHL 
VGEETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQKVK\^TGTGNVNVIQPNYPAAARDFLQTFRRGLLGSW 
LDLDVLRGHPRV 


7124 


2 


382 


LPLTIJJLAAPFAHLLLPPGHDQSPCWHPGPALSPGTLGPLSWAM 
ANSGLQLLGYFLALGGWVGI IASTALPQWKQSS YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 



156 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCX3SSESRGVNESHKSE 
FIELRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS 
LPESCLLT\RDTVIRSYLGAYITKWKPPPSPLIiALCTFLVSEKH 
AGHRSIiLEA\YLEILPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 
RAHVQEFFASSRDFFSSLQPLFAEAVDSIFSYSALLWAWCTVNT 
RAVYL\SPGSGNAFLQSRTPVQLAPYLDLLNHS PHVQVKAAFNE 
ETHS YEIRTTSRWRKHEEVFI CYGPHDNQRLFLE YGFVS VHNPH 
ACVYVSRGWNQLCS 


7126 


1 



733 



CRDMAAFI VPS PARRCSQKGS LGHLPTQPWLWAAMSPRGQERGT 
SHSQAREPQRPGRWLLGSLQSSPGTLGQAGTASRRRGCMVQRWV 
QVATGRRAVQVPKGALGLALGETSPGASRGMSGGAGGCWALGWA . 
PSPVLPSWLLEGPPPWLS 1 1 SDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 



GLPAMCST*KAGYYEETEGDC I PKDR* IEKRPFKEI * RRI PRI F 
AKQKQ I * S *NSQKIGASE IDRGRKEADCSDAPAAAR IGAVSVFR 
RSTQEARVSPRSNAKSANLRAVRAD * WEHFVLLFHTP EQFLAEC 
ICRST* *K* WHQLC* PLSSL*TGLKRKLLL* VLFRI * WLKDCDV 
* FCQKI FATNFCNWQNLIQ* EE * KPVEYSVEN*HIMNLL»LPM*Ii 
CQSSLRDQTIVTWRM*RNYSMFRINMISSL*DGSIHIPLKLHFY 
PALIFTLTVPINSCCQRPLPLFAHQS IKTLASSGS PMLACLRFL 
LVKKRAFIHTPRSPGCSV* CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALREXiSQIEAELNKHWRRLLEGLSYYKPPSP 
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDERHPYRVE YADCVDKLEKELVS KYRQQ FEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEIIFLYYAYFEMAPSD 
LLVLTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSALILVE 
GMDIESLHKCALDDRRELHQFAQDGL ICQDMDCLMLT FGD I PHH 
AP VIjIiAW ALiLtKii I JbN PEfci I o aWRKX GGTAI QLNVF QILTRLiLiw 
SLASGGNDCTTSTACMCVYGLLS FVLTSLELHTLGNQQDI IDTA 
CEVLADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RALVSGKS TAKKVYS FLDKMS FYNEL YKHKPHDVISHEDGTLWR 
RQTPKLLYPLGGQTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW 
TLFTCE I EMLLHWSTADVI QHCQRVKP 1 1 DLVHKVI STDLS I A 
DCLLPITSRIYMIjLQRLTTVISPPVDVIASCVNCLTVLAARNPA 
KVWTDLRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE 
QPQGEYGVTIAFLRIilTTLVKGQLGSTQSQGIiVPCVMFVbKEML 
PS YHKWRYNSHGVREQIGCLI LELIHAI LNLCHETDLHSSHTPS 

LQFLCICSLAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 

QGQGQLL I KTVKLAFS VTNNV IRLKPPSNWS PLEQALS QHGAH 

GNNLIAVLAKYIYHKHDPALPRLAIQLLKRLATVAPMSVYACLG 

NDAAAI RDAFLTRLQS K\ I E \DMR I K\ VMI L\ E FLTVA\ VETQP 

GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWELIDSQQ 

QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 

SPLFGTLSPPSETSEPSILETCALIMKIICLEIYYWKGSLDQP 

LKDTLKKFS I EKRFAYWSGYVKSLAVHVAETEGSS CTSLLEYQM 

LVSAWRMLL 1 1 ATTHAD IMHLTDSWRRQLFLDVLDGTKAIiLLV 

PASVNCLRLGSMKCTIiLLILLRQWKRELGSVDEILGPLTEILEG 

VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 

CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 

RDGVCVLGLHLAKELCEVDEDGDS WLQVTRRLP ILPTLLTTLEV 

SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGI TQS I CLPL 

LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 

LRYNFLPEALDFVGVHQERTLQCLNAVRTVQS LACLEEADHTVG 

FILQLSNFMKEWHFHLPQLMRDIQVNLGYLCQACTSFIiHSRKML 

QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 

ASEQQALHTVQYGLLKILSKTLAALRHFTPDVCQILLDQSLDLA 

E YNFLFALS FTTPTFDS EVAPS FGTLLATVNVALNMLGELDKKK 

EPLTQAVGLSTQAEGTRTLKSLLMFTMENCFYLLISQAMRYLRD 

PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 

SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 


7129 



1 



1054 



FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGELPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQS FNAWNYTNRSGDAPLTVNEL 
GTAYVS ATTGAVATALGLNALTKHVS PL I GRFVPFAAVAAANCI 
NI PLMRQRELKVG I PVTDENGNRLGESANAAKQAI TQVWSRIL 
MAAPGMAI PPF IMNTLE KKAFLKRFPWMSAP I QVGLVGFCLVFA • 
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPEIiRRVYFNKGL 


7130 


2 



780 



HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG 
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLLSLHRSS 
RCESHQDLLPDIADSHQMTEKLSDLTLQDSQKVVVVNRNLPIjN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNS PRTPKKPVNSKLGLSPYLTP 
YNDSDKLNDYLWRGPSPNQQNIVQSLREKFQCLiSSSSFA 


7131 


805 


573 


AAAEGH IE WKFL I EACKVNPFAKDRWGNI PLDDAVQFNHLE W 
KLLQDYQDS YTLS ETQAEAAAEALS KENLESMV 


7132 


1420 


1087 


IDMLLLSGALVSGPYTLITTAVSADLGTHKSLKGNAHALSTVTA 
I IDGTGSVGAALGPLLAGLLS PSGWSNVFYMLMFADACALLFLI 
RLIHKELSCPGSATGDQVPFKEQ 


7133 



2 


3648 


QQI PGLLPAHGESGDALRKPRLQKP I TGHLDDLFFTIiYPSLEKF 
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ 
RPQVWLVPEMDVALTRSASFSRKWSSSKTSSGSQALVIiRSRL 
RLPEMVGHPAFAVIFQLEYVFSSPAGVDGNAASVTSLSNLACMH 
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL 
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
E I LDANKQPAEAVSATE PVTFNPQKEESDCLQSNEMVLQFLAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA 
RYLAVQTLQ I DVWDGDS LLL I GSAAVQMKHLLRQGRPAVQASHE 
LEWATEYEQDNMWSGDMI^FGRVKPIGVHSVVKGRIiHLTLAN 
VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHWQAQKIiADVDSEIiAAMLLTHARQGKGPQDVSRBSDATRRRK 
LERMRS VRLQEAGGDLGRRGTS VIAQQSVRTQHLRDLQV I AAYR 
ERTKAES IASLLSLAI TTEHTLHATLGVAEFPEF\7LKNPHNTQH 
TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL 
afqlylkfhetahvf fkfqs fsag q lamvqas pglsnekgmdav 

SPWKSSAVPTKHAKVLFRASGGKPIAVIjCLTVELQPHVVDQVFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFLKVASGPSPEIKDFFVUYSDRWLATPT 
QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH 
PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD 
CHQLVAS WLVCLCCRQPL I SKAFE IMLAAGEGKGVNKR IT YTNP 
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 
\jjEttSrt±,SjX i J.JNiJiaJc*JJi\lNiS£*Ar L.VJ\V_. X v? 


7134


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYEEGLIDNSG 
LRLFYTMDIRKYDAGVIEAGLOTSLFHTIPPGMPEFQSEGHCTL 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQYLKEEQTILPGDl^LITECRYNTKDRAEI^T 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI 
YRPVTTWPFIIKSPKQYKNLSEMDAMNKFKWTKKEGLSFNKIiVL 
SLPVNVRCSKTDNAEWS IQGMTALPPD I ERPYKAEPLVCGTSSS 
SSLHRDFS INLLVCLLLLS CTLSTKS L 
2

3 



2072 



FVPRVTPRSLSLQGPKGESVGSITQPLPSSYLIFRAASESDGRC 
WLDALEI1AI1RCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 
ENKS LMWTLLKQLRPGMDLSRWLPTFVLEPRS FLNKLSDYYYH 
'ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG . 
ETFRCC^FHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 
GSITAKSRFYGNSLSALLDGKArLTFLNRAEDYTLTMPYAHCKG- 

ilygtmtleixk;kvtiec_aknnfqaqlefklkpffggsts inqi 
sgkitsgeevlaslsghwdrdvfikeegsgssalfwtpsgevrr 
qri_rqhtvpleeqteleserlwqhvtraiskgdqhratqekfal 

PUR^BADTkDDDAPCT MIT lTVtVT>/"\T WtTT Pill TiPADTittrvr tvnntto - iMT\n 

lilsAyKyKAKnKyfibJ-WlHWAPyij^ HJ^PITQEWHYRYEDHSPWDP 
LKDIAQFEQDGILRTLQQRAVARQTTFLGSPGPRHERSGPDQRL 
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILSIREAQQELHRHLSAMLSSTARAAQA 
PTPftT .T .0 ^ PP "FT .T .r*\/PT . a ITU .JPTKTP T T . V 
418


DFVPSFRRPSGNTSQ_VWIiLPJ_ATLEKEVAGLREKIHHLDDMLK 

SQQRKVRQMIEQLQNSKAVIQSKDATIQELKEKIAYLEAENLEM 

HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 
TPWFT 

J_Xv V V Hi J. 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERITCQRAQMRLRRQKXGVVPFLGDFLTELQRLDSAI 

P n DT .nfiNTWTfR ^ TCP VP VT .HP MOT ,T .CWTTi A MNTVP T.PPT .V , VTT\7 7 VVX? r r 

RMEQLSDKESYKLSCQLEPENP 


7138 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGVVPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKEVRVLQEMQLIjQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP 


7139 


1 


357 


SLRNSARGLKMAASAARGAAALRRSINQPVAFVRRIPWTAASSQ 
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALQ 
QENHI IDGVKVQVHTRRPKLPQTSDDEKKDF 


7140 


1401 


1357 


RASSLQVLKAWGGLI PSS FX2QQHTGQYALEELFDLKVYDCFCSF 
NMNVSLEKQLRPSQPWPRGKCRKTPGWEEARPKAQDLRGDLGKT 
QAGPAE AHTRGP PRLPAATGCP PHLPGLLSG I S VD I DPTGLQSQ 

WTPKGQDPPLMFSEDYQKSLLBQYHLGLDQKLRKyvVGEI>IWNF 
ADFMTNQCG 


7141 


124 



1073 



LDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL 
VT P EKPLRRGLSHRSD PNAVAPAPOGVRLSLG PL <5 PF KT ,P F T T .n 

EANRLAAQLEQCALQDRESAGEGLGPRRVKPSPRRETFVLKDSP 
VRDLLPTVNSLTRSTPS/LKQPDASTPE* * *EGVSQGSPGYIWK 
EALQHEEGVTHLQSVPCIQKPS I FSS \SRSTPPVRGRAGPSGRA 
AASEETRAAKIiRGAAAKSSCQLP IPSAI PRPASRMPLTSRS VPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142 


658 


839 


LIFLMLHMELKMLSSVTIiHIRAFLYWICLKPTSCLIFQNVLNLL 
KK*SRAVGVWVMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT 
LKKNKVTHILNVAYGVENAFLSDFTYKS IS ILDLPETNILS YFP 
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GSTTSSSKNIAYNCCWDQCQACFNSSPDLADHIRSIHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCVVGGCNA 
SFASQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSKAGIV1NKRR 
KLKNKRRRSLARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
HSWFHSTVSIIjLFFQIKYKTLQKNISTIISKSLKI 


7144 


1 


988 


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
R CPAPRPAGVS YVIRDE VEKYNRNGVNALQLDPALNRLFTAGRD 
S I IRI WSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLI SASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR. 
Q I FLWDVNTLTALTASNNTVTTS SLSGNKDS I YSLAMNQLGTI I 
VSGSTEKVLRWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS 
G S S DGT I RL WS LGQQRC I ATYRVHDEGVW ALQVNDAFTHVYSGG 
RDRKI YCTDLRNPD I RVL I CE 
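
The one-letter legend in the table header can be applied mechanically to a tabulated segment. The following Python sketch is illustrative only and forms no part of the disclosed methods. The names `LEGEND`, `MARKERS` and `summarize_segment` are hypothetical, and the sketch assumes that whitespace inside a segment is a layout artifact that may be ignored. It counts residues, stop codons, and possible deletion and insertion marks, and collects any character that the legend does not define.

```python
# One-letter codes as listed in the table header (names here are hypothetical).
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue marks defined by the same header.
MARKERS = {
    "*": "stop_codon",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize_segment(segment):
    """Count legend symbols in a segment; whitespace is assumed to be layout only."""
    summary = {
        "residues": 0,
        "stop_codon": 0,
        "possible_deletion": 0,
        "possible_insertion": 0,
        "unrecognized": [],  # characters the legend does not define
    }
    for ch in "".join(segment.split()):
        if ch in LEGEND:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary[MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # A line taken from the table entry for SEQ ID NO: 7142.
    print(summarize_segment("KK*SRAVGVWVMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT"))
```

A non-empty `unrecognized` list is a practical signal that a line contains characters outside the legend and should be checked against the original sequence listing before use.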



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said polynucleotide has greater than about 90% sequence identity with the
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 



6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1
operatively associated with a regulatory sequence that modulates expression of the
polynucleotide in the host cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of claim 1 for a period sufficient to form the complex;
and

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 
a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and 3573-5358,
a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 
20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprising the sequence
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 